AD-784 882


DYNAMIC  ANALYSIS  OF  EXECUTION: 
POSSIBILITIES,  TECHNIQUES  AND  PROBLEMS 


Birol  Omer  Aygun 


Carnegie-Mellon University


Prepared  for: 

Defense  Advanced  Research  Projects  Agency 
Air  Force  Office  of  Scientific  Research 


September 1973


DISTRIBUTED  BY: 


National  Technical  Information  Service 
U.  S.  DEPARTMENT  OF  COMMERCE 

5285  Port  Royal  Road,  Springfield  Va.  22151 


REPORT DOCUMENTATION PAGE

1. REPORT NUMBER: AFOSR-TR-74-1430

4. TITLE (and Subtitle): DYNAMIC ANALYSIS OF EXECUTION:
POSSIBILITIES, TECHNIQUES AND PROBLEMS

7. AUTHOR(s): Birol Omer Aygun

9. PERFORMING ORGANIZATION NAME AND ADDRESS:
Carnegie-Mellon University
Department of Computer Science
Pittsburgh, PA 15213

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:
61101D
A0827

11. CONTROLLING OFFICE NAME AND ADDRESS:
Defense Advanced Research Projects Agency
1400 Wilson Blvd
Arlington, VA 22209

12. REPORT DATE: September, 1973

13. NUMBER OF PAGES: [illegible]

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
Air Force Office of Scientific Research
1400 Wilson Blvd
Arlington, VA 22209

16. DISTRIBUTION STATEMENT (of this Report):
Approved for public release; distribution unlimited.

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)

18. SUPPLEMENTARY NOTES

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

The problem of designing computer systems which are far more helpful to the
user than current systems in dynamically analyzing program behavior is studied.
The functional requirements which such a facility must meet are outlined. The
fundamental objective is to permit the user to analyze a program in terms of a
user-defined level of abstraction suitable to his particular analysis. A
prototype implementation which meets most of the requirements is described. The
implications of such a facility for machine architecture to reduce execution
overhead are explored.

DD FORM 1473 (1 JAN 73): EDITION OF 1 NOV 65 IS OBSOLETE

UNCLASSIFIED


DYNAMIC  ANALYSIS  OF  EXECUTION: 


Possibilities, Techniques and Problems

by

Birol Omer Aygun


Department of Computer Science
Carnegie-Mellon University
Pittsburgh, Pennsylvania 15213
September, 1973


Submitted to Carnegie-Mellon University
in partial fulfillment of the requirements
for the degree of Doctor of Philosophy.


This  work  was  supported  by  the  Advanced  Research  Projects  Agency 
of  the  Office  of  the  Secretary  of  Defense  (F44620-70-C-0107)  and 
is  monitored  by  the  Air  Force  Office  of  Scientific  Research. 

This document has been approved for public release and sale; its
distribution is unlimited.




CARNEGIE-MELLON UNIVERSITY
COMPUTER SCIENCE DEPARTMENT

THESIS ABSTRACT

DYNAMIC ANALYSIS OF EXECUTION:
Possibilities, Techniques and Problems

by

Birol Omer Aygun


The problem of designing computing systems which are far
more helpful to the user in the analysis of a program's behaviour
at run-time than current systems is studied.

By considering four application areas, namely debugging,
flow analysis, performance measurement and storage reference
pattern analysis, a list of specifications for a "general-purpose
execution analysis facility" (GPEAF) is drawn up.

A prototype facility, called DAME (Dynamic Analysis and
Modelling Environment), implemented on the PDP-10 for studying
the behaviour of PDP-11 programs, is described. DAME contains
a PDP-11/20 simulator and a programmable analysis facility.

It is shown that DAME satisfies most of the above requirements.

Significant aspects of DAME are: (i) Access to the state
of the PDP-11 at memory and register cycle level, (ii) A flexible
hook mechanism which permits arbitrary analysis computations at
many points in the instruction cycle, (iii) A node mechanism
which permits the user to define over his program a "level of
abstraction" suitable for the desired analysis, (iv) A comprehensive
instruction set for analysis procedures.

The  node  mechanism,  perhaps  the  most  novel  feature  of  DAME, 
enables  the  user  to  define  at  run-time  a  set  of  "nodes"  in  his 
program,  in  terms  of  which  the  execution  will  be  monitored.  A 
node  is  a  portion  of  code,  viewed  as  a  "black  box",  having  unique 
entry  and  exit  points.  During  execution,  DAME  constructs  a  set 
of  the  inputs  and  the  outputs  of  each  occurrence  of  each  node. 

The node mechanism permits backtracking to any point in the
execution history, and control and data flow analysis at node
level.

Five detailed examples of the application of DAME to analyses,
difficult or impossible with other systems, are given. Example 1
illustrates the input/output sets of nodes and accessing the
previous values of an address. The PDP-11 program used in
Examples 1 through 4 is a recursive Quicksort program. Example 2
illustrates the determination of the transition frequency between
nodes and Example 3 analyzes the parallelism in the Quicksort
routine at the recursive call level as examples of control flow
analysis. Example 4 illustrates analysis of data flow between
two consecutive nodes by comparing the output-set of the first
with the input-set of the second. Example 5 illustrates a procedure
for the analysis of the instruction mix and addressing modes
used by PDP-11 programs.

The  present  performance  of  DAME  is  poor  due  to  simulation 
at  memory  cycle  level  and  checking  for  monitor  actions  at  every 
memory and register access. It runs 1000 to 2000 times slower
than  a  PDP-11/20  when  input/output  sets  are  not  used,  and  4000 
to  3000  times  slower  when  they  are.  Measurements  indicate  that 
respective  speed  ratios  of  300  and  2500  for  the  above  cases  are 
achievable  without  major  re-design. 

In  designing  analysis  facilities  for  ALGOL-like  languages, 
while  the  main  features  of  DAME  are  still  applicable,  other 
complexities arise (e.g. scopes of variables, recursion, selecting
a "unit of execution"). These problems and some approaches
to their solution are illustrated for a subset of the BLISS
language.

To  be  economically  feasible,  systems  such  as  DAME  will 
require  assistance  from  hardware.  Microprogrammed  implementations 
of  hook  and  node  mechanisms,  involving  tag  bits,  associative 
table  searches  and  monitoring  for  special  bit  patterns  to  detect 
hooks, are studied. Key problems are seen to be access to the
complete  state  of  the  monitored  machine,  interference  due  to 
resource  sharing  with  the  analysis  facility  and  scarcity  of 
microstorage. 


FOREWORD 


The research which resulted in this dissertation may be
viewed as a journey through a neglected area in computer science.
While most areas in computer science are in very primitive stages
of development, the area of dynamic analysis of program behaviour
is certainly one of the most neglected and potentially most
beneficial for both programmers and users of computers.


A look at the Table of Contents will show the reader the
many dimensions of this problem which had not received a systematic
examination up to now. Thus, the dissertation itself may
be regarded as a map of this heretofore neglected region,
identifying its major components and the relations among them.
Inevitably, not all components have been studied in the same degree
of detail. However, I hope that enough detail and insight have
been provided for the crucial parts to give a head-start to the
worker interested in designing such a system.


In retrospect, I would like to acknowledge with gratitude
the contributions of many individuals in various stages of the
research and thesis preparation. Professor David Parnas, a
member of the Computer Science faculty for most of the period
over which this research took place, provided valuable advice
during the formative stages of the research and during an earlier
implementation of a monitoring facility. Professor William Wulf
provided both general guidance and specific technical contributions
to the architecture and participated in the evaluation of
that facility. He also took part, with Professors Jack McCredie,
Sam Fuller and Mary Shaw, in the evaluation of the thesis proposal
and the progress of the research. In particular, he provided
a key idea in Chapter 7, which deals with execution analysis
facilities for high-level languages.


Special thanks are due the members of my thesis committee,
Professors Jack McCredie (Chairman), Victor Lesser, Sam Fuller,
Raj Reddy and Andrew Wong, for generously contributing their
time to the reading and discussion of the dissertation, for
numerous corrections and suggestions for improvements, and
re-reading the revision.


I am particularly grateful to Prof. Jack McCredie for the
continuous dialog, guidance and support he provided in both
technical and administrative matters related to the research
and the thesis. While the members of the thesis committee and
others have contributed much to the technical soundness and to
the form of the presentation of the thesis, I bear the sole
responsibility for any errors and any technical or editorial
deficiencies.




[...] utilizing tools developed by others, thanks are due to a number
of fellow workers. These include [...] Lunde, now with the
University of Oslo in Oslo, [...] Levin, Mady Bauer,
Richard Johnsson, Joe Newcomer, [...] Weinstock, Mario Barbacci,
[...], Jerry Apperson and [...], upon whose [...] I have [...].

[...] facility and financial support for four years.

[...] appreciation for the encouragement [...] which enabled me
to persevere through the [...] of undergraduate and graduate
study [...]. My deepest gratitude is to my wife, Guzin, who has
not only [...] several times [...] holding a full-time job, but
also sacrificed many social [...].


CONTENTS


ABSTRACT

FOREWORD

CHAPTER 1  INTRODUCTION AND MOTIVATION

1.1  Execution Analysis Defined
1.2  Objectives of Thesis
1.3  Major Application Areas
1.3.1  Debugging
1.3.2  Flow Analysis
1.3.3  Performance Measurement
1.3.4  Storage Reference Pattern Analysis
1.4  State of the Art in Dynamic Execution Analysis Tools

CHAPTER 2  FUNCTIONAL REQUIREMENTS FOR A GENERAL-PURPOSE
EXECUTION ANALYSIS FACILITY

2.1  Debugging
2.1.1  Control Bugs
2.1.2  Computation Bugs
2.2  Flow Analysis
2.2.1  Control Flow
2.2.2  Data Flow
2.3  Performance Measurements
2.4  Storage Reference Analysis
2.5  Summary of the Functional Requirements
2.5.1  Information Requirements of the Analysis System
2.5.2  Triggering of Analysis Actions
2.5.3  The Instruction Set of the Analysis Facility
2.5.4  External Appearance and Miscellaneous Useful Features

CHAPTER 3  THE DAME SYSTEM

3.1  The Underlying Data Structures
3.2  The Representation of the PDP-11 in the PDP-10
3.3  The Time-Grain of Simulation
3.4  The Hook Mechanism
3.5  The Node Mechanism
3.6  An Outline of DAME Instruction Set
3.6.1  General Purpose Computation Instructions
3.6.2  Execution Monitoring and Analysis Instructions
3.7  Various Design Issues and Unimplemented Ideas
3.7.1  Representation of -11 Core and the Design of the Hook Mechanism
3.7.2  Scheduling with Look-ahead
3.7.3  "Blow-up" Representation of the Processor Status Word
3.7.4  "Compilation" of Decoded -11 Instructions
3.7.5  Further Compilation of DAME Instructions
3.7.6  A Limited-Run Complete-Trace Feature

CHAPTER 4  ILLUSTRATIVE EXAMPLES OF SOME APPLICATIONS OF DAME

Example 1.  Nodes and Input/Output Sets of a QUICKSORT Program
Example 2.  Construction of Node Transition Matrix
Example 3.  Analysis of Parallelism in the QUICKSORT Program
Example 4.  Data Flow Between Two Nodes
Example 5.  Analysis of Instruction Mix and Addressing Mode Usage by PDP-11 Programs

CHAPTER 5  A PERFORMANCE MODEL FOR DAME-LIKE SYSTEMS

5.1  An Informal Characterization of DAME-like Systems
5.2  A Model of DAME-like Systems
5.3  The Overhead of the Node Mechanism
5.3.1  The Overhead of Detecting Node Entry and Exits
5.3.2  The Overhead of I/O Set Maintenance
5.4  Measurements of the DAME System
5.4.1  Performance of the PDP-11 Simulator
5.4.2  Node Entry/Exit Overhead
5.4.3  Input/Output Set Overhead

CHAPTER 6  HIGH-LEVEL LANGUAGES FOR EXECUTION ANALYSIS

6.1  Some Human Engineering Issues
6.2  High-Level Data Access in Execution Analysis
6.3  Continuous Evaluation of Expressions
6.4  Implementation of Continuously Evaluated Expressions

CHAPTER 7  EXECUTION ANALYSIS FACILITIES FOR ALGOL-LIKE LANGUAGES

7.1  The Added Complexity of High-Level Languages
7.1.1  On Increased Syntactic Complexity
7.1.2  On Increased Semantic Complexity
7.1.3  [...] Language Implementation Techniques
7.2  Execution Analysis Facilities for Interpreter-based Languages
7.3  A Mini Demonstration Language
7.3.1  Information Accessible by the MDL Analysis Facility
7.3.1.1  Representation and Accessing of MDL Execution History
7.3.1.2  Access to the Internal State and Generic References to Expression Sequences
7.3.1.3  Access to MDL and MDLAF Texts
7.3.2  Contact Points and Hook Insertion
7.3.3  An Outline of the MDL Analysis Facility Language (AFL)

CHAPTER 8  ARCHITECTURAL FEATURES FOR EXECUTION ANALYSIS

8.1  The Hook Mechanism
8.1.1  Monitoring with W_H > W_0
8.1.2  Monitoring with W_H = W_0
8.1.3  Monitoring with W_H < W_0
8.2  Implementation of the Node Mechanism
8.3  The Interface between the Analysis Facility and the Central Processor
8.4  The Analysis Facility Processor (AFP)

CONCLUDING REMARKS

REFERENCES

APPENDIX A:  DAME USER MANUAL

APPENDIX B:  SYNTAX OF MDL


CHAPTER 1

INTRODUCTION AND MOTIVATION


1.1  Execution  Analysis  Defined 

As the impact of computer technology pervades essentially
every aspect of contemporary civilization and as we relegate
more and more responsibilities to the computer, it is reasonable
to expect that programmers, analysts and users of programs
will need more and more powerful tools to analyze the
behaviour of programs they are concerned with. The kinds of
analyses one can immediately think of include, but are not
limited to, debugging, performance measurement, validation and
certification. As the complexity of programs grows far beyond
the ability of any one individual or a small group of individuals
to completely understand and predict their behaviour at
any level which is of interest (as it already is today with
most large programming systems), the need for better tools to
answer questions about the workings and the behaviour of programs
grows proportionately. I shall use the term "execution
analysis" to include any inquiry into the behaviour of a program,
normally in a specific class of environments. I shall leave the
word "program" undefined, relying on its intuitive meaning,
except to require that the analyst be able to identify what
is to be considered as a part of a program and what is not. I
shall further concentrate on the execution of programs on computers
similar in basic architecture to those in most common
use today, i.e. in which each processor has a single instruction
stream and addresses a linear primary memory, at least as
seen by the programmer. By execution analysis, then, more
specifically, I shall mean inquiries into the machine states,
and the relationships among machine states, which are evoked
by a particular set of executions of a particular program on
such a machine.

1.2  Objectives of Thesis

The main objective of the thesis is to report on a research
project into the design of environments which would facilitate
a very broad range of execution analyses. Of particular interest
are:

(i) representation of execution history information in
a manner which facilitates the introduction of high-level
constructs for describing and capturing diverse aspects of
program behaviour,

(ii) a particular set of such constructs which provides
a kernel for a large class of analyses,

(iii) extendability of the provided set,

(iv) a general-purpose programming facility for sensing
(arbitrarily complex) conditions and taking associated actions
at any point during the execution of the program under analysis.

In addition to these, I shall consider, in less detail,
the extension of the presented ideas to high-level languages
and the architectural implications for machine design arising
from them.

1.3  Major  Application  Areas 

Although  questions  related  to  execution  analysis  pervade 
every  area  of  computer  science  and  technology,  for  the  purposes 
of  concreteness,  I  shall  select  and  examine  in  detail  several 
of  the  more  prominent  ones.  The  objective  of  this  examination 
will  be  to  arrive  at  a  set  of  functional  requirements  for  an 
analysis  facility  which  will  substantially  facilitate  such 
analyses.

1.3.1  Debugging

I  do  not  wish  to  dwell  unnecessarily  on  the  fundamental 
importance  of  the  debugging  problem  and  its  magnitude.  Let 
a  quote  by  J.  T.  Schwartz  from  a  recent  symposium  on  debugging 
systems  Kl'  1971]  suffice.  "Normally,  at  the  beginning  of 
the  debugging  process,  even  a  programmer  with  some  past  expe¬ 
rience  can  never  believe  how  bad  things  are  really  going  to  be 
before  the  end."  Schwartz  also  gives  a  thoughtful  exposition 
of the classes of bugs plaguing the field. Here, I shall
follow  roughly  his  approach  to  present  a  taxonomy  of  debugging 
problems . 

Let me first delineate the field of "bugs" from all other
forms of error in programs: specifically excluded are (i)
syntax errors, (ii) errors due to total incompetence on the
part of the programmer; that is, if a program is so far off
in design and implementation from performing its intended
function that it would have to be completely, or almost completely,
rewritten to make it work, I shall not consider such a
circumstance a "bug". Thus, I shall consider an error a bug
if and only if it can be corrected by changing a small part
of the total program, although it may well have taken quite a
long time to isolate it. This limitation I place on the kinds
of errors I shall call bugs is necessary, because otherwise one
runs into the strongly unsolvable problem of proving (or disproving)
the equivalence of algorithms in general.

The first class of bugs I shall mention (following
Schwartz) are those due to losing count of things: e.g.
exceeding array bounds, one too many or one too few iterations
in a loop. Another common class is due to omission of initialization
of variables, "manifest in situations in which things
fail to have either the initial or terminal value which a
programmer expects". These are examples of bugs which usually
manifest themselves in the early stages. In later stages,
"situational" bugs become apparent in the interfaces between
relatively distant parts, i.e. in the assumptions those parts
make about each other (following D. Parnas's definition of
"interface"). "Semaphore bugs" and timing bugs also generally
belong to this class, their existence being due to lack of
cooperation among the various parts. Also in this large class
is the set of bugs due to not reading the language or system
reference manual properly or to errors in such publications,
especially with respect to services provided by the system, e.g.
the meaning of a particular bit combination in a device control
register or the parameter-passing conventions for a system macro.

Schwartz next considers "various aspects of the habitat
of bugs". In languages which permit the use of pointers, in
particular assembly and higher-level implementation languages,
one often transfers off to nowhere or begins writing into some
strange place. (This author has received countless "illegal
memory reference" messages from the operating system during the
development of this project and wished there were a debugging
system with the facilities described in this report, although
he did have access to some of the better debugging facilities
provided by current systems.)

In his thesis entitled "The Debugging of Computer Programs",
R. Stockton Gaines provides a more structured taxonomy of bugs,
which is summarized below:

"1-  Point of origin in the programming process: that is,
whether the bug arose in the formulation of the program, or
during its implementation...

2-  Whether in data definition or data manipulation...

3-  Control and Computation bugs...

4-  Bugs resulting from lack of knowledge or misunderstanding
of features of the operating environment...

5-  Fatal and non-fatal bugs...

6-  The point at which the bug may be detected. Some
may be detected automatically (that is, in a purely mechanical
fashion by checks provided in the compiler or in the generated
code or operating system), while others can only be found by
intelligent activity on the part of the programmer."

The  most  characteristic  feature  of  the  debugging  activity 
is  the  search  for  the  cause  of  an  unexpected  program  behaviour 
which  has  just  been  observed.  If  one  attempted  to  diagnose  the 
observed  anomaly  through  the  examination  of  a  voluminous  set 
of  unstructured  execution  trace  data,  one  would  often  have  an 
insurmountable  task.  Hence,  the  aim  of  debugging  tools  is  to 
narrow  down  the  amount  of  data  to  be  looked  at  "with  the  intent 
of locating an operation transforming reasonable arguments into
an  unreasonable  result"  (Schwartz) .  The  risk  involved  in  reducing 
the  amount  of  data  collected,  of  course,  is  the  possibility  of 
leaving  out  some  important  information  about  the  program  behaviour 
which  could  lead  to  the  isolation  of  the  bug.  Hence,  given  a 
debugging system which the user can direct to collect certain
kinds  of  information,  one  measure  of  the  power  of  the  debugging 
system  is  the  degree  of  precision  with  which  the  user  can  specify 
what  kinds  of  data  he  wants  collected.  Other  measures  are  the 
ease  with  which  the  user  can  state  his  specification  and  the  time 
between  the  iterations  of  a  debugging  step. 

Normally, the first aim of the programmer in the debugging
process  is  to  put  bounds  on  the  portions  of  execution  history 
involving  improper  program  behaviour.  This  requires  an  ability 
to  move  back  and  forth  easily  in  the  execution  history,  to  observe 
the  data  flow  as  a  function  of  control  flow  and  vice  versa. 
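
As an illustration only, moving "back and forth" in the execution history can be pictured as keeping a record of every store and querying it by time. The Python sketch below is not part of DAME; the class name `History`, its methods, and the notion of a numbered execution "step" are assumptions made for the example. It assumes writes are recorded in increasing step order.

```python
import bisect


class History:
    """Record every write so that earlier values of an address can be recovered."""

    def __init__(self):
        # address -> (list of steps, list of values), both in recording order
        self.writes = {}

    def record(self, step, address, value):
        """Note that `value` was stored at `address` during `step`."""
        steps, values = self.writes.setdefault(address, ([], []))
        steps.append(step)
        values.append(value)

    def value_at(self, address, step):
        """Value held by `address` just after `step`, or None if not yet written."""
        steps, values = self.writes.get(address, ([], []))
        k = bisect.bisect_right(steps, step)
        return values[k - 1] if k else None

    def first_bad_write(self, address, is_reasonable):
        """Step of the earliest write that stored an unreasonable value, if any."""
        steps, values = self.writes.get(address, ([], []))
        for step, value in zip(steps, values):
            if not is_reasonable(value):
                return step
        return None


h = History()
h.record(1, 100, 5)
h.record(4, 100, 7)
h.record(9, 100, -3)
print(h.value_at(100, 8))                         # value before the last write
print(h.first_bad_write(100, lambda v: v >= 0))   # step that stored a negative value
```

A query such as `first_bad_write` gives an upper bound on the portion of history that needs inspection; a real facility would also have to bound the storage consumed by the history itself.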

This  brings  us  to  the  area  of  flow  analysis,  which  has 
applications in many areas besides debugging.

1.3.2  Flow  Analysis 

By  "flow  analysis"  of  a  program  P,  I  shall  mean  inquiries 
into  the  relations  between  sequences  of  machine  states  which 
arise  during  a  set  E(P)  of  executions  of  P.  The  set  E  may  be 
small,  or  large  enough  to  be  considered  infinite.  For  example, 
consider  a  sorting  program  S(a,b)  whose  parameters  a  and  b  are 
the  starting  and  ending  addresses  of  a  vector  of  integers  to  be 
sorted.  Then  the  set  of  all  executions  of  S(a,b)  for  all 
a = 0,...,[n/2]- and b = [n/2]+,...,n, where n is the number of core
locations in user address space, is essentially infinite.

(By [k]- and [k]+, I denote the "floor" and "ceiling" of k
respectively.)




A typical problem in flow analysis is determining the
set of all successors of every node in the program and the
associated transition probabilities. The result is normally
expressed as an m x m matrix M where M(i,j) is the probability
of node j being the next node if the current node is i. This
problem can be extended to the determination of the set Q(i,k)
of all k-node sequences following node i. This extended problem
may be regarded as the determination of "path-traversal
probabilities" and requires somewhat more elaborate machinery than
the determination of single step transition probabilities. This
class of models of control flow is generally called "Markovian
models" and, under certain assumptions of independence of past
program behaviour, yields probabilistic information about patterns
of behaviour. Coupled with estimates of CPU usage at each node,
these models can give resource usage estimates over arbitrarily
long paths.
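
The two quantities just described can be sketched in a few lines. The Python below is a minimal illustration, not the procedure used in the thesis: it assumes the execution has already been reduced to a list of node names (a "trace"), and the function names `transition_matrix` and `k_node_sequences` are hypothetical. Probabilities are estimated as observed frequencies in that one trace.

```python
from collections import Counter


def transition_matrix(trace, nodes):
    """Estimate M[i][j] = P(next node is j | current node is i) from one trace."""
    index = {name: k for k, name in enumerate(nodes)}
    m = len(nodes)
    counts = [[0] * m for _ in range(m)]
    # Count every observed single-step transition cur -> nxt.
    for cur, nxt in zip(trace, trace[1:]):
        counts[index[cur]][index[nxt]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A node with no observed successor gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def k_node_sequences(trace, node, k):
    """Q(i, k): count each k-node sequence observed immediately after `node`."""
    q = Counter()
    for pos, cur in enumerate(trace):
        following = tuple(trace[pos + 1 : pos + 1 + k])
        if cur == node and len(following) == k:
            q[following] += 1
    return q


trace = ["A", "B", "A", "C", "A", "B"]
M = transition_matrix(trace, ["A", "B", "C"])
Q = k_node_sequences(trace, "A", 2)
```

Dividing each count in `Q` by the number of occurrences of the starting node would give estimated path-traversal probabilities; whether these are meaningful depends on the independence assumptions mentioned above.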


Another example of flow analysis is that used in predicting
the degree of parallelism which can be obtained. In this case,
one tries to determine which parts of a program could be run
independently of which other parts and at which point in the
execution. In general, parts which do not communicate at all
can be run in parallel. The problem often becomes one of
decomposing the program into parts which either do not communicate
at all or whose communication permits synchronization of their
execution while still maintaining a substantial amount of
overlapped, parallel execution. The determination of "communication"
between two parts of a program can be very difficult. For example,
whether or not part A "tells anything" to part B may be
input-dependent in a complex way. I shall call this problem the "data
flow" problem. Data flow, together with control flow, constitutes
the essence of a program's logical behaviour. I shall deal with
this problem at length in the rest of the thesis.
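
One simple way to make "tells anything" concrete for a single execution is to compare what one part writes with what a later part reads. The Python sketch below is an assumption-laden illustration, not the DAME mechanism: the representation of a part's activity as a list of `("r" | "w", address)` pairs and the function names `io_sets` and `communicates` are hypothetical, and an address counts as an input only if it is read before the same part writes it.

```python
def io_sets(accesses):
    """Return (input_set, output_set) for one execution of a program part.

    `accesses` is a list of ("r" | "w", address) pairs in execution order.
    An address is an input if it is read before this part writes it,
    and an output if this part writes it at all.
    """
    inputs, outputs = set(), set()
    for kind, addr in accesses:
        if kind == "r" and addr not in outputs:
            inputs.add(addr)
        elif kind == "w":
            outputs.add(addr)
    return inputs, outputs


def communicates(first, second):
    """Addresses through which `first` passes data to a later `second`."""
    _, out_first = io_sets(first)
    in_second, _ = io_sets(second)
    return out_first & in_second


part_a = [("r", 100), ("w", 200), ("r", 200)]
part_b = [("r", 200), ("w", 300)]
part_c = [("r", 400), ("w", 500)]
shared = communicates(part_a, part_b)    # A writes 200, B reads it
nothing = communicates(part_a, part_c)   # no address in common
```

An empty result for one run does not show that the parts never communicate, since the access pattern may depend on the inputs; and a full test for parallel execution would also have to consider the reverse direction and conflicting writes.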

1.3.3  Performance Measurement


Two types of performance measurement have already been
discussed under Flow Analysis. Another, more general type of
performance measurement problem is the timing of arbitrary paths
through a program. We may wish, for example, to define several
paths through a program; start timing when one of these paths
is entered and stop timing when the execution deviates from it.
We may decide to keep or discard measurements of partial
traversals of a path. We may further wish to start the measurement
activity only after certain events have happened; e.g. after
a certain routine has been called a fixed number of times. This
might be desirable, for example, in evaluating the time spent
in a space-management routine only when it is called in the
"main-loop" portion of a program, after the initial allocation
of space has been made. Thus we need control over which
paths are to be measured and under what conditions they are
to be measured.
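
The kind of control described above can be sketched as a small state machine driven by node-entry events. The Python below is only an illustration under stated assumptions: the class `PathTimer`, its parameters, and the event format (node name plus a timestamp) are hypothetical, and elapsed time is taken from entry to the first node of the path up to entry to its last node, or up to the entry of the node at which execution deviates.

```python
class PathTimer:
    """Time traversals of one node path in a stream of node-entry events."""

    def __init__(self, path, keep_partial=False, arm_node=None, arm_count=0):
        if len(path) < 2:
            raise ValueError("a path needs at least two nodes")
        self.path = list(path)
        self.keep_partial = keep_partial  # keep timings of abandoned traversals?
        self.arm_node = arm_node          # measure only after this node ...
        self.arm_count = arm_count        # ... has been entered this many times
        self.seen = 0                     # entries to arm_node so far
        self.pos = 0                      # number of path nodes matched so far
        self.start = None
        self.samples = []                 # (nodes_matched, elapsed, complete)

    def on_entry(self, node, now):
        """Feed one event: `node` was entered at time `now`."""
        if node == self.arm_node:
            self.seen += 1
        if self.pos > 0:
            if node == self.path[self.pos]:
                self.pos += 1
                if self.pos == len(self.path):
                    self.samples.append((self.pos, now - self.start, True))
                    self.pos = 0
                return
            # Execution deviated from the path: stop timing.
            if self.keep_partial:
                self.samples.append((self.pos, now - self.start, False))
            self.pos = 0
        armed = self.arm_node is None or self.seen >= self.arm_count
        if armed and node == self.path[0]:
            self.start, self.pos = now, 1


timer = PathTimer(["A", "B", "C"], keep_partial=True)
for node, t in [("A", 0), ("B", 5), ("C", 9), ("A", 10), ("B", 12), ("D", 20)]:
    timer.on_entry(node, t)
```

Several paths can be measured at once by feeding the same events to several `PathTimer` objects; the `arm_node` and `arm_count` parameters correspond to starting measurement only after a routine has been called a fixed number of times.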

In timing a program P running in a time-sharing environment,
the other programs running concurrently with P have a certain
amount of effect on the measurements on P. As an example, in
some time-sharing systems, the overhead for handling an interrupt
is charged to the program which was running when the interrupt
came in, which is not necessarily the one to which the interrupt
belongs. It should be possible in a general-purpose execution
analysis facility to measure accurately the time taken by a
program, as well as usage of other system resources, e.g. main
storage. This brings us to a class of analyses which are
traditionally done by post-mortem processing of a tape file containing
the sequence of addresses generated by means of a hardware probe
during the execution of the program whose behaviour is under
analysis. These analyses, which are especially important in
paged systems, are called "Storage Reference Pattern Analyses"
and are discussed further in the next sub-section.


1.3.4  Storage Reference Pattern Analysis


For the analysis of programs from a paging point of view
one can identify several major variables: the hardware (in
particular, page size and the paging store), the operating system
(in particular, its paging policies), the system load and the
particular program we are analyzing (in particular, its
page-reference patterns). In theory, it is possible to hold one or
more of these variables constant and vary the others. However, in
practice, one most often has to hold at least the first two
constant, live with uncontrolled variations in the third and
try to improve the fourth.

It is always beneficial to analyze the static reference
pattern of a program from the program text. One can achieve a
certain amount, perhaps a great deal, of improvement this way.
However, in general, the storage reference pattern is a function
of the inputs. Therefore, one needs to gather dynamic information
on the page-reference behaviour of various parts of one's program.
There is no easy way to get this information at present. There
are hardware devices for measuring all the page references in
a computer system over a specific interval of time. Not only
are these devices hard to get and to make routine, practical use
of, they are also not dynamically controlled; so it is not
possible to monitor only a specific part of a program. Clearly
a more flexible tool for obtaining and analyzing this data is
needed.




In this sub-section, I have discussed four areas (namely
debugging, flow analysis, performance measurement and storage
reference pattern analysis) for the application of the execution
analysis techniques described in this thesis. In Chapter 4, I
shall take specific problems from these four areas and illustrate
the usage of the prototype software facility DAME in solving
them. In the next sub-section, I shall survey the state of the art
in execution monitoring facilities.

1.4  State  of  the  Art  in  Dynamic  Execution  Analysis  Tools 

The first remark one can make regarding the state of the
art in this area is that it is almost non-existent outside the
sub-area of debugging tools. A further indication of the state
of the art in execution analysis is the fact that all the on-line
debugging tools which have come to this author's attention could
be used for various other types of execution analyses but they
hardly ever are; in fact, with minor extensions they could be
made into quite useful execution analysis tools. Software
technology has failed to utilize properly even its existing
tools in execution analysis. This could be attributed to their
being labelled "debugging aids" as well as to the fact that
meaningful execution analysis data are not required of programmers,
in industry or in the universities. At any rate, the following
list is representative of the types of machine language debugging
facilities found in major systems (e.g. see [BO 68]):

1-  Setting and removing breakpoints at arbitrary points
in a program,

2-  Computing arbitrary functions of the state of the
user-addressable core at a breakpoint,

3-  Referencing core symbolically,

4-  Transferring control to an arbitrary core location at
a breakpoint,

5-  Calling other debugging procedures,

6-  Modifying contents of core,

7-  Defining symbols private to the debugging system and
using them as normal identifiers in debugging procedures,

8-  Specifying automatic collection of the values of specific
locations,

9-  Directing dumps and traces to user-specified devices.




As mentioned earlier, these abilities form a basis upon
which more useful analysis systems could be built. However,
the use of these facilities appears to have remained largely
in debugging.

I would now like to mention several systems whose facilities
have gone in somewhat different directions. In [ST 65]
T. G. Stockham describes a graphical debugging system which has
the ability to interact with the user during the course of the
execution in terms of the flowchart of the program, a significant
advance in man/machine communication. D. U. Wilde wrote a
program which statically charted the data and control flow in
IBM 7090 programs and attempted to construct functional
expressions relating the interacting variables [WI 67]. Its main
limitation was that it confined itself to static analysis. More
recently, the Symbol computer developed by Fairchild Corporation
permits the user to specify a routine to be activated when
user-specified locations are accessed. This ability forms the
basis for a significant advance in execution monitoring and
analysis capabilities. A similar feature was described for an
Algol-like language by J. McNeley [MCN 68]. R. Balzer's EXDAMS
system [BA 67], though it never became operational, made an
attempt to provide high-level facilities for obtaining traces,
extracting information from them, displaying it flexibly on a
display tube and running the program forwards and backwards. The
work by T. Cheatham and his associates at Harvard toward a
software laboratory included components for monitoring the value
of user-specified predicates while a program runs on a real
machine or a software interpreter of machine language.

Various machine-simulator based debugging systems have
been built and reported. The MIMIC system described by R. Supnik,
the AIDS system of R. Grishman and the HELPER system described
by H. Kulsrud are some good examples of such systems. (See
[RU 71] for reports on these systems.) The main limitations of
these systems are: (i) the very limited amount and types of
computational power which they appear to have been designed to
provide; (ii) they offer no higher-level unit of programming
than individual instructions, e.g. the effect on the machine
state of five consecutive instructions storing into the same
location would be recorded as five separate entries and there
is no way to change this; thus they very quickly run into main
storage problems. The ability on the part of the user to define
a global structure over his program containing elements of arbitrary
size and to capture arbitrary information about data
and control flow between these elements would greatly facilitate
the analysis of interesting questions about the program's run-time
behaviour.




Another class of systems which is relevant to this topic
is the so-called "virtual machines". (For a set of papers on
virtual machines, see [ACM 73].) However, in the virtual machines
reported so far, no analysis facilities or features for user
control of the computation significantly better than the
breakpoint-oriented debugging facilities (such as PCS [BO 68])
of interactive systems have been described.




CHAPTER  2 


FUNCTIONAL REQUIREMENTS FOR A GENERAL-PURPOSE EXECUTION

ANALYSIS FACILITY


In this chapter, I would like to review and classify
the functional capabilities required to accomplish the
classes of tasks outlined in Chapter 1 and to arrive at a
set of functional specifications for what can truly be called
a "general-purpose execution analysis facility" (GPEAF).
By "functional specification" I mean that only "what" is
wanted is to be specified, leaving the method of implementation
open. Having made this statement, let me violate
it just once, in order to give a more concrete context
for what is to follow: If we view the kinds of analysis
tasks which have been mentioned as points in a space of
infinitely many, continuous dimensions, then the set of
functional capabilities of a GPEAF can be viewed as a set
of primitive operators and data structures which, when
used in composition, juxtaposition and iteration in normal
programming style, permit one to proceed easily to most points
in that space. This statement will serve as a qualitative
specification of the overall function of a GPEAF.

2.1  Debugging 

Let us first consider the kinds of questions that
arise most commonly in the debugging process. Recalling the
types of classifications of bugs given in sub-section 1.3.1,
probably the most promising breakdown is into "Control"
bugs and "Computation" bugs. I do not wish to imply that
I believe these two classes are independent, rather simply
that in thinking over all the hours (days, years) I spent
debugging programs, it seems that a great many of those
bugs could be comfortably placed into one of these two classes.

2.1.1  Control  Bugs 


Control bugs most often appear locally, in the form
of errors in conditional branch statements or in the number
of iterations in a loop. While the actual nature of each
error in control flow is, of course, specific to the particular
program, the kind of action one would like to take to diagnose
such a bug would be to be able to say something like:
"If I have just done X and the machine state is Y, take
diagnostic action $D_i$ if the next state is $Z_i$". Here, X
may take the form of a list of calls on subroutines, possibly
with specific values or with a specific relation over the
values, instruction addresses or other partial specifications
of the instruction with specific operand addresses or values.
Example: "After S1 has called S2 twice with parameters
A, B and C such that A>B>C, followed by n calls on S3 by
S2, followed by two MOVEs to location k, do...". Note
that the word "follow" may be taken in its strict sense,
i.e. "immediately follow", or simply to mean "come sometime
after". Further, it is unclear whether the diagnostic
action is to be taken only on the first occurrence of the
specified condition or on all its occurrences. It should
be possible to formulate all of these alternative
interpretations. Also note that the specifications over the actual
parameters of a subroutine call require that the analysis
facility be able to determine, or, failing that, be told
by the user, the locations of the parameters. The machine
state specifications Y and $Z_i$ are partial predicates involving
possibly complex functions over the state of the memory,
including any general-purpose or device registers.

The diagnostic action $D_i$ may involve, minimally,
suspending the execution and displaying certain elements of
core. In addition, we may wish to compute the value of
a function and store or display its result, automatically
continue execution from the same or a different point, or
we may wish to backtrack to an earlier point in the execution
history. This last requirement, namely backtracking, involves
two parts:

1-  The specification of the point B to which we wish
to backtrack, and the associated search over execution
history,

2-  The actual backtrack operation.

Having made a backtrack, one may wish to execute a
certain number of instructions and jump forward to an
intermediate state, and eventually resume execution from the
point where the original backtrack command was issued.

Note  that  the  form  of  the  debugging  request  given 
above  does  not  cover  predicates  involving  the  time-series 
of  the  values  of  a  location,  e.g.  those  of  a  variable  whose 
value  is  modified  in  each  iteration  of  a  loop.  This  leads 
to  the  general  concept  of  the  time-series  of  the  values  of 
a  variable  -  which  appears  to  be  a  natural  and  useful  construct 
for  debugging.  I  shall  refer  to  this  as  the  "value-trace" 
of  a  variable.  The  number  of  values  to  be  kept  should 
be  user-specified. 
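
The value-trace idea can be illustrated with a short sketch. This is only one possible reading, assuming that the facility is notified of every store into a watched address; the names ValueTrace, record and history are hypothetical and do not come from the thesis.

```python
from collections import deque


class ValueTrace:
    """Keep the most recent `depth` values stored into each watched address."""

    def __init__(self, depth):
        self.depth = depth      # user-specified number of values to keep
        self.traces = {}        # address -> deque of past values

    def record(self, address, value):
        # Assumed to be called by the monitor on every store to `address`.
        trace = self.traces.setdefault(address, deque(maxlen=self.depth))
        trace.append(value)

    def history(self, address):
        # Oldest retained value first, newest last.
        return list(self.traces.get(address, ()))


def strictly_increasing(values):
    # Example of a predicate over the time-series of a variable.
    return all(a < b for a, b in zip(values, values[1:]))
```

A predicate such as strictly_increasing can then be applied to history(k) for a loop variable stored at location k, which the single-state form of the debugging request cannot express.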


2.1.2  Computation Bugs

Under this title I shall include errors in "formulas",
generally characterized by a sequence of arithmetic operations
concluded by an assignment. They are distinguished from
Control Bugs by the trivial, localized control flow involved.

In this class of bugs, we are concerned with the past and
current values of variables as well as the new values to
be assigned to them in a particular instruction or set of
instructions. Note that some of these values may in fact
be addresses of indirect operands. Hence we are interested
in all the operands (including intermediate pointers and
side-effects such as setting of the condition code and automatic
incrementation or decrementation of registers) involved in
an instruction as well as their relation to the instruction.
For example, we want to know not only that instruction I
fetches something from address A but also whether A is the
eventual source operand, a pointer to the eventual source
operand, the eventual destination operand, or a pointer to
the eventual destination operand, etc. Hence, for any particular
machine, there needs to be a characterization of every
operand involved in every type of instruction in its instruction
set and a corresponding mechanism in the analysis facility
which permits one to refer to each of those operands through its
relation to the instruction. Facilities such as this permit
one, for example, to say: (i) "If I ever multiply (any
number) by a negative number and store the result into X,
let me know and stop"; or (ii) "If the truncation error
involved in an integer division, defined as abs(1-((destination
operand*result)/source operand)), ever exceeds 5%, do ...".

Let us note an ambiguity in the former request (i): often
the result of a computation, such as the multiplication in
this case, is stored temporarily in a different location
than its eventual destination. Later, perhaps after several
instructions, it is moved to its eventual destination. This
is particularly true of machines which do not have
memory-to-memory operations. In such machines, the high-speed
registers are used to hold temporary results very often. In
many cases a temporary result may remain in a register over
several instructions. In such a case, how should the "store
into X" be interpreted: as an "immediate store", a "store
within a fixed number of instructions" or an "eventual
store", meaning a store sometime before the computed value
is modified? The answer to this question is the same as
the answer to earlier questions about interpretations of
requests: namely, that it does not matter; it should be
possible to formulate every interpretation within the analysis
facility.
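
The three readings of "store into X" can be made concrete with a small sketch. It assumes a simplified trace in which each instruction reads one location and writes one location, and in which only an instruction named "MOVE" copies a value unchanged; Instr, store_distance and the mnemonics are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    op: str     # mnemonic, e.g. "MUL", "MOVE", "ADD" (illustrative only)
    dst: str    # location written by the instruction
    src: str    # location read by the instruction


def store_distance(trace, start, target):
    """Number of instructions after trace[start] until its result reaches
    `target`: 0 if it is stored there directly, None if every copy of the
    result is overwritten before it gets there."""
    holders = {trace[start].dst}        # locations currently holding the result
    if target in holders:
        return 0
    for distance, ins in enumerate(trace[start + 1:], start=1):
        if ins.op == "MOVE" and ins.src in holders:
            holders.add(ins.dst)        # the result was copied
            if ins.dst == target:
                return distance
        else:
            holders.discard(ins.dst)    # any copy in ins.dst is destroyed
            if not holders:
                return None
    return None


def immediate_store(d):
    return d == 0


def store_within(d, k):
    return d is not None and d <= k


def eventual_store(d):
    return d is not None
```

For the trace MUL R1, ADD R3, MOVE X<-R1 the distance is 2, so the "immediate" reading fails while the "within 2 instructions" and "eventual" readings both succeed.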

I would now like to mention a construct and an associated
notation first used (to the best of my knowledge) by C. A. R. Hoare
and employed in E. W. Dijkstra's "A Short Introduction to the
Art of Programming". Let S stand for a piece of program
text, in general affecting values of variables, i.e.
changing the current state. Let B1, B2, ... stand for either
predicates stating a relation between values of variables or
for pieces of program text evaluating such a predicate, i.e.
delivering one of the values true or false without further
affecting the values of variables, i.e. without changing the
current state. Then

     P1 [S] P2

means: The truth of P1 immediately prior to the execution of
S implies the truth of P2 immediately after that execution of
S. Dijkstra then goes on to state some theorems relating
Pi's, Bi's and S. The relation P1[S]P2 given above seems to be
another natural and useful construct for debugging purposes: it
is a succinct formulation of a question about the effect of a
piece of code S on any part of the machine state. It is clearly
desirable that such relations for arbitrary P1, S and P2 be
testable easily within the analysis facility.
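
As a minimal sketch of what testing P1[S]P2 on one execution might look like, assume the machine state is modelled as a dictionary and that P1, P2 and S are ordinary callables. The name check_triple is hypothetical; a run-time test of this kind checks the relation for a single starting state only and proves nothing about other states.

```python
def check_triple(p1, s, p2, state):
    """Test P1 [S] P2 on one concrete state.

    Returns True when the relation holds for this execution: either P1 is
    false beforehand (so nothing is claimed) or P2 is true afterwards.
    S runs on a copy, so the caller's state is left untouched."""
    if not p1(state):
        return True
    after = dict(state)
    s(after)
    return bool(p2(after))


def increment_x(state):
    # Example piece of code S: x := x + 1
    state["x"] += 1
```

For example, check_triple(lambda st: st["x"] >= 0, increment_x, lambda st: st["x"] > 0, {"x": 0}) asks whether x >= 0 before the increment implies x > 0 after it, starting from x = 0.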

The discussion above implies the following functional
requirements for debugging:

D1-  Tracing the path of control down to the
instruction level,

D2-  Determining the type of instruction being executed,

D3-  Following arbitrarily complex pointer chains in core,

D4-  Determining the addresses and values (old and new) of
all the operands (explicit and implicit) of an instruction as
well as their relation to the instruction,

D5-  Keeping an arbitrary number of previous values of any
address, in an easily accessible form,

D6-  Computing arbitrary functions over the current machine
state,

D7-  Searching execution history (backwards and forwards)
for a state satisfying a user-specified predicate,

D8-  Efficient restoration of a state found in such a search,

D9-  Stopping and starting execution,

D10-  Performing any sequence of the operations D1 through
D9 at any and each of: operand fetch, operand store, instruction
fetch and instruction completion times.

While it is rather imprecise to talk about the "completeness"
of a debugging system (or of a system with respect to debugging),
one can get a certain amount of reassurance of the sufficiency
of these requirements for a debugging facility by convincing
oneself that they offer a great deal of help in isolating all
the classes of bugs mentioned in Chapter 1, sub-section 1.3.1.

2.2  Flow Analysis

As  stated  in  Chapter  1,  by  the  "flow  analysis"  of  a  program 
P,  I  shall  mean  inquiries  into  the  relations  between  sequences 
of  machine  states  which  arise  during  a  set,  E(P),  of  executions 
of  P.  It  is  helpful  to  think  of  the  program  counter  as  a  distinct 
entity  from  the  rest  of  the  machine  state.  In  machines  having 
a  built-in  stack,  it  may  also  be  useful  to  think  of  the  stack 
pointer  as  a  third  distinct  entity,  especially  in  high-level 
languages  which  do  not  permit  explicit  access  by  the  user  to  the 
elements  in  the  stack.  I  shall  not  do  so  here,  since  I  shall 
be  mainly  concerned  with  machine  language  programs  where  every¬ 
thing  is  essentially  global. 

Thus, thinking of the program counter (PC) as a separate
entity from the rest of the machine state (which I shall call
"memory", M), we can identify two types of flow: Control Flow
and Data Flow, where the former refers to the sequence of values
assumed by the PC and the latter to the sequence of states of M.
I would like to emphasize the word "sequence" in the last sentence.
The word "flow" implies a sequence of changes to one phenomenon
relative to another. Hence, for example, we might ask: "Starting
from a particular state of M and PC, what is the kth value of
PC?". Or similarly, "Starting from a particular state of M and
PC, what is the kth state of M?". Or, "Given that M=M1 when
PC=P1, what is M when PC=P2?".

2.2.1  Control  Flow 


Normally, it suffices to consider only changes from sequential
flow in order to be able to reconstruct the entire history of
control flow. One must be careful, however, to include enough
information to indicate when each change occurred. For example,
one may include the starting address of the program, followed
by pairs $(a_i, b_i)$, i=1,...,n, where $a_i$ and $b_i$ are the
origin and the destination of the ith branch instruction,
respectively. Alternately, one might let $a_i$ be the starting
address of a block of straight-line code and $b_i$ the number of
instructions in that block. A GPEAF should have facilities for
sensing the fetch of an instruction from a location X and the
completion of its execution, and for reconstructing the last N
branches (origin and target), for arbitrary N. It should also be
able to execute a user-specified procedure before and after any or
every instruction. (This ability was also listed as a requirement
under debugging.) It is important, though this can also be
implemented by the user himself using the above facility, that
when the user gains control before or after an instruction, he be
able to determine the address of the previous instruction. E.g. if
one can jump to an address A from several locations, it is
necessary to be able to determine easily at A where one came
from.
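
The first encoding described above, a starting address followed by (origin, destination) pairs, can be expanded back into the full control-flow history as sketched below. The sketch assumes, purely for illustration, that every instruction occupies one address unit and that the address of the last instruction executed is also known; the function names are hypothetical.

```python
def reconstruct_pc_sequence(start, branches, stop):
    """Rebuild the full sequence of program-counter values.

    start    -- address of the first instruction executed
    branches -- list of (origin, destination) pairs, in execution order
    stop     -- address of the last instruction executed
    Assumes every instruction is one address unit long, so that
    sequential flow means pc + 1."""
    sequence = []
    pc = start
    for origin, destination in branches:
        sequence.extend(range(pc, origin + 1))  # straight-line run up to the branch
        pc = destination
    sequence.extend(range(pc, stop + 1))        # final straight-line run
    return sequence


def last_branches(branches, n):
    # The last N taken branches (origin and target), oldest first.
    return branches[-n:] if n > 0 else []
```

With such a record, the question "where did I come from?" at a branch target A is answered by the origin of the most recent pair whose destination is A.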

2.2.2  Data  Flow 

If  we  think  of  data  flow  as  a  sequence  of  changes  to  the 
state  of  the  memory  M  representing  the  progress  of  execution, 
it  becomes  clear  that  in  order  to  be  able  to  analyze  it,  we  must 
first  relate  it  to  control  flow.  That  is,  we  must  be  able  to 
determine  which  changes  are  associated  with  which  parts  of  the 
execution  path.  In  general,  many  parts  of  the  execution  path 
may  result  in  an  identical  effect  on  the  state  of  M.  Thus, 
data  flow  analysis  must  be  able  to  determine,  where  possible, 
the  precise  part  responsible  for  any  given  effect. 

Some  Fundamental  Relations  in  Data  Flow 

Let us consider two contiguous parts, A and B, with B
temporally following A, in the execution path of a program.
Suppose that we would like to know the data flow from A to B.
More specifically, we would like to know the set of addresses
which are both modified by A and read by B before being modified
again, and the values of those addresses upon entry into B
(in the absence of any outside interference, this is equivalent
to the values of those addresses upon exit from A). Let us
define as the input-set, $I_A$, of A the set consisting of pairs
$(a_i, v_i)$ where $a_i$ is the ith unique address from which A
reads something before writing into it, and $v_i$ is the value
read. Let us also define as the output-set, $O_A$, of A the set
consisting of pairs $(b_i, u_i)$ where $b_i$ is the ith address
written by A and $u_i$ the contents of $b_i$ upon exit from A.

Then, the data flow, $D_{\langle AB \rangle}$, from A to B can be
characterized simply as:

(1)  $D_{\langle AB \rangle} = O_A \cap I_B$

Note that in computing this intersection, it suffices to
look at only the address parts of the elements of the two sets,
since, due to the temporal adjacency of A and B, equality of
addresses will imply equality of contents. However, no harm
will result if, in order to maintain the conventional set-theoretic
definition of intersection, we require that only those elements
which are identical in all respects (i.e. in every component used
to include them in their respective sets in the first place) be
included in the intersection. Hence, the conventional definition of
intersection in set theory will suffice for the relation (1).
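
A minimal sketch of relation (1), assuming each part is available as a list of memory accesses of the form (kind, address, value) with kind "r" for a read and "w" for a write. This representation and the function names are assumptions made for illustration, not the thesis's data structures.

```python
def input_set(accesses):
    """I_A: (address, value) for each address read before being written."""
    written, seen, result = set(), set(), set()
    for kind, address, value in accesses:
        if kind == "r" and address not in written and address not in seen:
            result.add((address, value))   # first read, not preceded by a write
            seen.add(address)
        elif kind == "w":
            written.add(address)
    return result


def output_set(accesses):
    """O_A: (address, contents on exit) for each address written."""
    final = {}
    for kind, address, value in accesses:
        if kind == "w":
            final[address] = value         # later writes replace earlier ones
    return set(final.items())


def data_flow(accesses_a, accesses_b):
    """D_<AB> = O_A ∩ I_B for temporally adjacent parts A and B."""
    return output_set(accesses_a) & input_set(accesses_b)
```

Because the sets hold (address, value) pairs, the ordinary set intersection is used, as the text permits for adjacent parts.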

Let us now consider an important step in data flow analysis,
namely the compaction of two consecutive parts into one. This
step is fundamental to many types of flow analyses; see for
example [CO 71]. To do this, we shall need the following additional
notations:

     $C(a,t)$ = contents of location a at time t,

     $T_e(A)$ = time of entry into part A,

     $T_x(A)$ = time of exit from part A,

     <AB> = the part consisting of the temporal
            juxtaposition of parts A and B.

We can now characterize the input set of <AB> as follows:

(2)  $I_{\langle AB \rangle} = I_A \cup (I_B - O_A)$


Here again, it suffices to consider the conventional
set-theoretic definition of the union operation, since the equality
of the address part of an element in $I_A$ to that of an element
in $(I_B - O_A)$ implies the equality of their value parts. This
can be briefly proved as follows:

Proof: Suppose that for some $p=(a_p,v_p)$ and $q=(a_q,v_q)$,
$p \in I_A$ and $q \in (I_B - O_A)$ respectively, $a_p = a_q$ and
$v_p \neq v_q$. Now, since A and B are consecutive,
$v_q = C(a_q,T_e(B)) = C(a_q,T_x(A))$, i.e. the contents of
location $a_q$ are unchanged between the exit from A and entry
into B. But since $a_p = a_q$, $v_q$ must also equal
$C(a_p,T_x(A))$. Thus, for $v_p \neq v_q$ to hold, the contents of
address $a_p$ must be modified during A. But this contradicts
our definition $q \in (I_B - O_A)$. Hence no such elements p and q
can exist.


Finally, we can characterize the output set of <AB> as:

(3)  $O_{\langle AB \rangle} = O_B \cup^{*} O_A$

where $\cup^{*}$ denotes an extension of the union operation to one
which "favors" the left-hand operand over the right-hand one in the
sense that, if there is an element $(a_L,v_L)$ in L and
$(a_R,v_R)$ in R such that $a_L = a_R$ but $v_L \neq v_R$, then
$L \cup^{*} R$ includes only $(a_L,v_L)$ in the resulting set.

This operation simply assures that if some address is
modified by both A and B, then only the final effect will be
recorded in the output set of <AB>.
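
Relations (2) and (3) can be sketched directly on sets of (address, value) pairs. The function names below are hypothetical; the ordinary set difference is used for $I_B - O_A$, which is adequate for consecutive parts because an address present in both sets necessarily carries the same value.

```python
def input_of_concatenation(i_a, o_a, i_b):
    """Relation (2): I_<AB> = I_A ∪ (I_B − O_A)."""
    return i_a | (i_b - o_a)


def favoring_union(left, right):
    """L ∪* R: on an address clash, keep the left-hand pair only."""
    left_addresses = {address for address, _ in left}
    return left | {(a, v) for a, v in right if a not in left_addresses}


def output_of_concatenation(o_a, o_b):
    """Relation (3): O_<AB> = O_B ∪* O_A, so B's later writes win."""
    return favoring_union(o_b, o_a)
```

For instance, with $O_A$ = {(11, 8), (12, 7)} and $O_B$ = {(12, 0)}, the compacted output set is {(11, 8), (12, 0)}: the write to address 12 by A is superseded by the later write by B.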

These  three  highly  intuitive  relations  form  a  base  upon 
which  many  data  flow  analysis  mechanisms  can  be  built. 

So far, we have been concerned only with consecutive parts,
where we are assured that nobody else will get in between the
parts involved in the data flow. But now let us consider the
case where the parts, A and B, of the execution path, the data
flow between which we wish to explore, are not consecutive.
What should the analysis facility be able to tell us about the
effects of intervening parts $C_i$, i=1,...,k, on the data flow
from A to B? There are at least two reasonable answers:

1-  That the analysis system be able to tell us whether any
of the $C_i$'s had any effect on the data flow from A to B or
not, or

2-  That the analysis system be able to give us a list
of the effects, e.g. a list of pairs $(C_i, v_i)$, where the
first element of the pair indicates the affecting
part, and the second element indicates the effect.

A little reflection shows that the first, "yes" or "no"
alternative is not satisfactory unless there is some other
means of obtaining the information provided by the second
alternative. Hence I shall adopt the latter as the information
which the analysis facility must provide.
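
A sketch of the second alternative, under the assumption that an "effect" of an intervening part is the set of (address, value) pairs it writes into addresses that carry data from A to B. The function name, the argument layout and the part names are hypothetical.

```python
def intervening_effects(o_a, i_b_addresses, intervening):
    """List (name, effect) pairs for intervening parts that disturb A -> B.

    o_a            -- output set of A as {(address, value), ...}
    i_b_addresses  -- addresses that B reads before writing them
    intervening    -- [(name, output_set), ...] for C_1..C_k in time order
    An effect is the set of (address, value) pairs that the part wrote
    into addresses carrying data from A to B."""
    carried = {address for address, _ in o_a} & set(i_b_addresses)
    effects = []
    for name, o_c in intervening:
        touched = {(a, v) for a, v in o_c if a in carried}
        if touched:
            effects.append((name, touched))
    return effects
```

The "yes or no" answer of the first alternative is then simply whether the returned list is non-empty, which shows that the second alternative subsumes the first.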

What  Maketh  a  Part? 


The preceding discussion of data flow has been based on
the notion of parts in a program, which could be treated as
units of control flow and which provide the link between control
flow and data flow. Now, let us consider what properties a
part should have, and how we would recognise one if we saw one.

The main motivation for the introduction of the notion of
parts was to break up the entire execution path of a program
(with a set of inputs) into more manageable units for gathering,
manipulating and interpreting information about data flow.

The smallest conceivable unit for this purpose is a memory
cycle (between the processor and main memory, or the general
registers). For executions involving tens or hundreds of thousands
or millions of machine instructions, this would require the
recording of about four or five times that many (address, value)
pairs. This amount of data is clearly too voluminous to store
in main memory. It conceivably could be dumped into secondary
storage periodically and searched as needed. However, it would
[...]. A memory cycle therefore appears to be too small a unit.

The next unit to consider as a part is the single machine
instruction. Here the storage required is of the same order of
magnitude as above [...] indirectly addressed operands. To reduce
it, one could develop a technique for constructing a
"template" for each instruction in the program, showing its static
operand addresses (all of which may not be apparent in the
instruction itself), and relate to it each execution instance
of that instruction and the dynamic operand addresses involved
in that instance. Such a technique can at best save less than
[...] of the storage required in the preceding alternative.

Let us now move up one more level, to the level of groups
of instructions. The immediate, obvious question is "How big should
a group be?" The reason that one tends to raise the issue of
size before other issues here is that so far we have found the
two previous alternatives unattractive because of the size of
storage required. Once we move to the level of groups of instructions,
however, we have a great deal of flexibility. For example,
we may consider as a part groups of instructions which
correspond to some syntactic programming unit, such as a
subroutine. Or we may wish to consider what are usually
called "basic blocks" by compiler writers, namely, blocks of
instructions having a unique entry point and a unique exit point.
(We must remember at this point that a "part" refers to a part
of the execution path, not of the program text; i.e. for groups
of instructions, a part refers to a particular execution of one
group.) Or, to be more flexible, we can let the user define what
should be a part. This latter choice has the advantages of controlling
the amount of storage required as a function of the length
of execution as expected by the user and of having a part correspond
to a conceptual step in the solution of the user's problem.
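
The "basic block" choice can be illustrated with a minimal Python sketch. It assumes that the execution path is available as a list of executed instruction addresses and that the set of block entry addresses is already known; the names split_into_parts, trace and leaders are hypothetical and do not come from the thesis.

```python
from typing import Iterable, List, Set


def split_into_parts(trace: Iterable[int], leaders: Set[int]) -> List[List[int]]:
    """Cut an execution trace into parts, one per executed block instance.

    A new part begins every time control reaches a leader (block entry)
    address, so a block executed twice yields two separate parts.
    """
    parts: List[List[int]] = []
    for address in trace:
        if address in leaders or not parts:
            parts.append([])          # start a new part of the execution path
        parts[-1].append(address)
    return parts


# A loop body at addresses 10..12 executed twice, then an exit block at 20.
trace = [0, 1, 10, 11, 12, 10, 11, 12, 20]
leaders = {0, 10, 20}
parts = split_into_parts(trace, leaders)
```

Here the two executions of the loop body become two distinct parts, which reflects the remark that a part belongs to the execution path and not to the program text.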

Given all these alternative strategies for defining parts,
the criteria for judging the suitability of a particular choice
of strategy are:

1- How well does the chosen strategy perform in answering
data flow questions?

2- How practical is it to implement?

In Chapter 3 I shall describe one choice and discuss its
implementation and performance.

Units  of  Data  Flow 

The most elementary unit of data as represented in digital
computers is the ubiquitous "bit". On the other hand, by far
the largest fraction of processing is done in terms of "words",
the size of which varies from computer to computer. Further, a
significant amount of processing is done in terms of fractions
of words, called "bytes", and a relatively smaller portion in
terms of blocks of words. In machine languages, "blocks" are
rarely used as individual operands in an instruction (a notable




exception  being  the  "transfer  block"  instruction  implemented 
in  certain  machines).  The  "bit"  is  also  very  infrequently 
used  as  an  individual  operand.  Rather,  it  is  usually  employed 
to  express  side-effects  of  certain  operations,  e.g.  the  setting 
of  the  condition  code,  the  bits  in  a  processor  status  word  and 
so  on.  These  side-effects  are  an  essential  part  of  the  effect 
of  an  instruction  and  hence  any  analysis  facility  must  represent 
and  give  access  to  them  in  an  adequate  way. 

The "bytes" come in two flavors (no pun intended): fixed
size and variable size, fixed size being the more commonly used.
In variable-size-byte machines, such as the PDP-10, one needs
both a starting position and a length to characterize a byte
whereas with fixed-size machines one needs only the starting
position. Bytes also form an important unit of data flow and
should be dealt with in full by a GPEAF. For example, in the
input and output sets of a part, the location (word address and
starting position within the word), size and contents of byte
operands should be properly reflected.
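
A byte operand of this kind could be recorded as in the following minimal Python sketch. The record layout, the name read_byte and the convention that the starting position is counted in bits from the low-order end of the word are all assumptions made for illustration; they are not taken from the thesis or from any particular machine.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ByteOperand:
    word_address: int   # address of the word holding the byte
    position: int       # bit offset of the byte's low-order bit (assumed convention)
    size: int           # byte length in bits; constant on fixed-size-byte machines
    contents: int       # value of the byte


def read_byte(memory: Dict[int, int], word_address: int, position: int, size: int) -> ByteOperand:
    """Extract a `size`-bit byte starting `position` bits from the low end."""
    word = memory[word_address]
    value = (word >> position) & ((1 << size) - 1)
    return ByteOperand(word_address, position, size, value)


memory = {0o100: 0o123456}          # one word at octal address 100
operand = read_byte(memory, 0o100, position=6, size=6)
```

With these values the extracted byte is octal 34, i.e. the third and fourth octal digits of the word.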

The "word" is probably the most appropriate unit for representing
the largest fraction of data flow. I do not feel that I
need to dwell on the precise definition of a "word", since its
meaning for, probably all, commonly used machines today is clear.
An interesting class of exceptions to this would be machines, paper
or real, for directly executing high-level languages, such as a
LISP machine or a SNOBOL machine. In such machines, the selection
of the unit of data flow probably ought to be closely related to
the primitive data structures of the language (e.g. atoms, lists,
strings).

Thus,  we  can  conclude  our  discussion  of  appropriate  units 
for  representation  of  data  flow  by  saying: 

1-  The  main  criteria  for  judging  the  suitability  of  a 
proposed  set  of  units  are:  (a)  Is  it  capable  of  representing 
all  elements  of  data  flow?,  and  (b)  How  efficiently,  in  terms 
of  storage  and  interpretation  speed,  does  it  represent  the  great 
majority  of  operations? 

2-  The  choice  of  data  flow  units  has  a  large  impact  on  the 
efficiency  of  the  analysis  facility  and  hence  its  usefulness. 

To  summarize  the  functional  capabilities  required  for  control 
flow  and  data  flow  analysis  tasks,  we  can  list  them  as  follows: 

F1- Giving the control to the user (or a user-specified
analysis procedure) before or after every instruction, and before
or after user-specified instructions,

F2- Dividing the execution path into parts as specified
by the user and enabling the user to refer to these parts explicitly,

F3- Constructing the input and output sets of parts, as
these sets were defined earlier,

F4- Determining the data flow from a part to the following
part as per relation (1),

F5- Computing the combined input and output sets of adjacent
parts, as per relations (2) and (3),

F6- Determining the effects of intervening parts on the
data flow between non-adjacent parts, as discussed earlier,

F7- Enabling the user to access every element of any input
set and any output set, and use the address and value parts of
the element in computations.
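
Capabilities F3 to F5 can be made concrete with a small Python sketch. The definitions of the input and output sets and relations (1) to (3) appear earlier in the thesis and are not reproduced here, so the sketch rests on an assumed reading: the input set of a part holds <address, value> pairs for locations read before being written within the part, and the output set holds the final values of the locations written. All names are hypothetical.

```python
from typing import Dict, List, Tuple

Access = Tuple[str, int, int]       # ("r" or "w", address, value)
ValueSet = Dict[int, int]           # address -> value


def input_output_sets(accesses: List[Access]) -> Tuple[ValueSet, ValueSet]:
    """Build the (assumed) input and output sets of one part."""
    inputs: ValueSet = {}
    outputs: ValueSet = {}
    for kind, address, value in accesses:
        if kind == "r":
            # Only a read that precedes any write in the part is an input.
            if address not in outputs and address not in inputs:
                inputs[address] = value
        else:
            outputs[address] = value   # the last write wins
    return inputs, outputs


def flow(out_a: ValueSet, in_b: ValueSet) -> ValueSet:
    """Values produced by part A and consumed by the following part B."""
    return {a: v for a, v in out_a.items() if a in in_b}


def combine(in_a: ValueSet, out_a: ValueSet, in_b: ValueSet, out_b: ValueSet):
    """Input and output sets of A followed by B, treated as one part."""
    combined_in = dict(in_a)
    for a, v in in_b.items():
        if a not in out_a and a not in combined_in:
            combined_in[a] = v         # B's input that A did not supply
    combined_out = {**out_a, **out_b}  # B's writes override A's
    return combined_in, combined_out


in_a, out_a = input_output_sets([("r", 100, 5), ("w", 200, 6)])
in_b, out_b = input_output_sets([("r", 200, 6), ("r", 300, 7), ("w", 100, 13)])
a_to_b = flow(out_a, in_b)
both_in, both_out = combine(in_a, out_a, in_b, out_b)
```

In this example location 200 carries data from the first part to the second, while locations 100 and 300 remain inputs of the combined part.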

2.3 Performance Measurements

Performance measurements are concerned with relating the
resource requirements of a functioning unit to the degree to
which it achieves its goals. For example, one might relate the
storage and CPU requirements of a compiler to the compactness
and efficiency of the object code it produces. A GPEAF should
offer the analyst high flexibility in making these measurements.

We can also talk about performance measurements of operating
systems. For example, scheduling, storage allocation and paging
policies have become the subject of much research and analysis
from a performance point of view. An operating system can be
measured in two ways:

(i)  We  can  measure  its  component  programs  just  as  we 
measure  user  programs  (i.e.  their  storage  and  CPU  requirements  etc.) 

(ii) We can measure the performance of the whole system
while it processes a given workload (i.e. in terms of throughput,
average response time (for time-sharing systems), paging rate etc.)

I shall refer to the first class of properties as "program
performance" and to the second class as "system performance".

Measurement  and  Modelling  of  Program  Performance 

Among the most frequently used measures of program performance
are such criteria as:

(1) How long it runs with a certain input,

(2) How it spends its time,

(3) How much main storage it requires.

Measures (1) and (3) are generally provided by the operating
system as user accounting data. (2) is usually obtained by
using timing packages or explicitly reading the system clock
(job-time) within the user program. In either case, the user
has to recompile his program to vary the measurements of type (2)
he wants to make. Further, in multiprogrammed systems, the
CPU and storage charges for running the same program with the
same inputs can vary significantly as a function of the other
programs running at the same time. I shall not go into the
reasons for this. Let it suffice to say that one often cannot
get a pure measure of a program's running time through conventional
operating system facilities. Hence it behooves an analysis
facility to offer much more help in this area.


Another problem in measuring the performance of existing
programs has been what to do about the parts which we do not want
to measure but which provide inputs to the parts which we do
want to measure. In large programs, the program modification,
recompilation and re-loading time and effort required for each
change one wants to make to deal with this problem have often
made such measurements too cumbersome and time-consuming to
undertake casually. It should be possible within an analysis
facility to model or "dummy up" the logic and simulate the timing
of uninteresting parts of a program and "skip over" them, and
execute and measure in detail the interesting parts. (This
procedure is quite familiar to those who have done hand-patching
of the code produced by a compiler.) Using a GPEAF, one should
be able to define a number of paths p_i, i = 1, ..., k, through a
program, possibly using the "part" definitions discussed earlier,
and measure the time for each complete traversal of each path.
This requires that one be able to sense departures from a path
at some intermediate point in the path.

Another technique for measuring where a program spends its
time is periodic sampling of the program counter. This technique
has the drawback that unless the period of sampling is chosen
with great care, certain parts of the program may never appear
in the samples because of "lock-step" synchronization between
the sampling and the pattern of control flow. However, this
problem can be overcome with a certain amount of analysis. This
technique has the advantage of considerably less overhead, compared
with other techniques such as timing each subroutine entry
and exit. To permit this technique, an analysis facility must
enable the user to schedule "sampling probes" with a dynamically
controlled frequency (to overcome the problem mentioned above).
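
The lock-step problem can be reproduced with a short Python sketch. It assumes that the program counter history is available as a list of addresses and it varies the sampling interval randomly; random variation is only one possible way of controlling the frequency and is not claimed to be the method intended in the text. The function names are hypothetical.

```python
import random
from collections import Counter
from typing import List


def sample_fixed(pc_trace: List[int], period: int) -> Counter:
    """Sample the program counter every `period` instructions."""
    return Counter(pc_trace[i] for i in range(0, len(pc_trace), period))


def sample_jittered(pc_trace: List[int], mean_period: int, rng: random.Random) -> Counter:
    """Sample with an interval that varies randomly around `mean_period`."""
    counts: Counter = Counter()
    i = 0
    while i < len(pc_trace):
        counts[pc_trace[i]] += 1
        i += rng.randint(1, 2 * mean_period - 1)   # average step is mean_period
    return counts


# A loop of four instructions (addresses 0..3) executed 1000 times.
pc_trace = [0, 1, 2, 3] * 1000
fixed = sample_fixed(pc_trace, period=4)
jittered = sample_jittered(pc_trace, mean_period=4, rng=random.Random(1))
```

With a fixed period of 4 every sample falls on address 0, so three quarters of the loop never appear; with the varied interval all four addresses are observed.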

With regard to measuring the storage requirements of a
program, since these are strongly tied to the storage reference
patterns, I shall discuss those two subjects together in subsection 2.4.

Measurements and Modelling of System Performance

Under this topic I shall consider the measurement of such
properties of operating systems as system overhead, CPU utilization,
and paging rates (where applicable) as a function of
job mix and system design. It might be, reasonably, felt that
we are straying afar from our initially stated purpose of the
analysis of program behaviour. However, it must be pointed out
that the "true behaviour" of an operating system program cannot
be studied without some experimentation involving the processing
of a typical job mix. It is true that studies involving
the characteristics of an operating system over several days or
weeks of user time probably fall outside the scope of an analysis
system of the type envisioned here, although many functions,
such as measurement of the average time between interrupts, the
storage reference patterns, the average job running time etc.,
which can be measured by a GPEAF, could be useful in such studies.

There is another, perhaps more interesting, way however,
in which a GPEAF ought to be useful in such analyses. This
approach involves the modelling of parts of the user workload
and of the operating system via analysis system facilities by
which one can mimic the logic and/or the resource requirements
of these programs, as mentioned earlier under Measurement and
Modelling of Program Performance. An example of such a model
is a routine in the language of the GPEAF, which simulates a user
job which generates an I/O request every K_i, i = 1, ..., n,
milliseconds of CPU time, where each K_i may be a random number drawn
from a distribution. Another example might be a model of the
page reference pattern of a job. One might take an ensemble of
such models of user jobs and model the execution of those jobs
under a given operating system, by invoking the facilities of a
GPEAF to interface the models with the operating system. One
might even model parts of the operating system (such as I/O
servicing, scheduling etc.) for purposes of expediency or efficiency.
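
The first example model can be sketched in Python as follows. This is only an illustration of the idea, not a routine in the language of any analysis facility; the generator name io_bound_job and the choice of an exponential distribution with a mean of 20 milliseconds are assumptions.

```python
import random
from typing import Callable, Iterator, Tuple


def io_bound_job(n: int, draw_interval: Callable[[], float]) -> Iterator[Tuple[float, str]]:
    """Yield (cpu_time_ms, event) pairs for a modelled user job.

    The job computes for K_i milliseconds and then issues an I/O
    request, for i = 1..n; each K_i is obtained from `draw_interval`.
    """
    cpu_time = 0.0
    for _ in range(n):
        cpu_time += draw_interval()
        yield cpu_time, "io_request"


rng = random.Random(0)
events = list(io_bound_job(5, lambda: rng.expovariate(1 / 20.0)))  # mean 20 ms
regular = list(io_bound_job(3, lambda: 10.0))                      # constant K_i
```

Passing the distribution as a function keeps the model independent of any particular choice of K_i; an ensemble of such generators could stand in for a job mix.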

Let  me  now  summarize  the  capabilities  required  for  the 
performance  measurement  tasks  which  have  been  discussed: 




P1- [...] arbitrary, user-defined paths in a program,

P2- [...] specific [...] program, or [...] are needed by a user,

P3- Performing arbitrary computations when control is gained,

P4- Simulating the passage of arbitrary lengths of time.


2.4 Storage Reference Analysis


[...] the pattern with which [...] reference pages. This
pattern, if it can be [...] page. For example, it may be useful
to [...] is likely to be a "dirty" page [...] has been written
into since it was brought in, so that it has to be written out [...]

[...] the working-set size of a program, [...] number of unique
pages addressed by the program.

[...] of his program, first he must be able to obtain the existing
[...] pattern and determine the effects of each change he [...]

[...] successive stores into a register with no intervening
fetches from it. During such a period, that register [...] be
used to hold the values of other variables. [...] whenever a
register can be profitably used to hold the values of several
different variables (even if this may mean saving and restoring
each such value), the efficiency-cons[...] to analyze the pattern
of reference to each such register.




There are many other types of analyses which might be
called "storage reference analyses" but which I shall not
enumerate.


The  functional  requirements  for  these  kinds  of  analyses 
can  be  summarized  as: 


S1- Obtaining every address (including registers) generated
by the program, when it is generated,

S2- For each generated address, an indication of whether
it is an instruction, an operand fetch or a store,

S3- Making arbitrary computations whenever a generated
address and the associated indication is obtained.
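
Given a stream of this kind, a simple storage reference analysis might look like the following Python sketch. The reference format, the page size of 512 addressing units and the function name are assumptions; the count of unique pages follows the description of working-set size given in the text, and a page is treated as "dirty" once it has been the target of a store.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Reference = Tuple[str, int]      # (kind, address); kind is "ifetch", "fetch" or "store"
PAGE_SIZE = 512                  # assumed page size in addressing units


def analyse_references(refs: Iterable[Reference], page_size: int = PAGE_SIZE):
    """Return (pages touched, dirty pages, reference counts by kind)."""
    pages: Set[int] = set()
    dirty: Set[int] = set()
    kinds: Counter = Counter()
    for kind, address in refs:
        page = address // page_size
        pages.add(page)
        if kind == "store":
            dirty.add(page)       # a written page must later be written out
        kinds[kind] += 1
    return pages, dirty, kinds


refs = [("ifetch", 0), ("fetch", 600), ("ifetch", 2), ("store", 1030), ("fetch", 610)]
pages, dirty, kinds = analyse_references(refs)
```

The computation inside the loop plays the role of S3: any other per-reference analysis could be substituted for it.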


2.5 Summary of the Functional Requirements


In this section, I would like to summarize the functional
capabilities required for the four analysis areas which have
been discussed. I have no formal proof that these capabilities
form a "complete" set; nor do I pretend to know precisely what
the "completeness of an execution analysis facility" may mean.
However, certainly it must mean "something more" than the trivial,
formal completeness in the sense of being able to compute all
computable functions. Below, I give my understanding of what
that "something more" is.


We  can  consider  the  required  capabilities  in  four  classes: 

1-  What  information  the  analysis  facility  has  access  to, 

2-  At  what  points  in  the  execution  cycle  it  can  gain  control, 

3-  Its  instruction  set, 

4- External appearance and miscellaneous useful features.


2.5.1 Information Requirements of the Analysis System


The Analysis System needs access to at least two address
spaces: the address space of the object machine (which shall
also be called the "external state of the OM") and its own symbol
space. (Some may want to consider the former as an element, e.g.
a large array, in the latter.) In particular, every address and
register accessible by the object program must be readable and
writable by the Analysis Facility. In fact, the access to the
object machine address space should be very easy and direct.




If the Analysis System is inefficient in long computations
and therefore a need for a linkage to programs written in a
compiler level language (such as the one in which the Analysis
System may be written) is indicated, then the Analysis System
routines should have access to the symbol space of that compiler
level language.

It is desirable that the Analysis System have access to
the operand addresses and values of the current object instruction
(which shall also be called the "internal state of the OM"),
without having to decode them itself. Thus, at the end of an
instruction cycle, one should be able to say, in effect: "If
this is a MOVE instruction and the source operand value is zero,
and the destination address is between A and B, then do...".
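
The quoted condition can be written as a predicate over an already-decoded instruction state, as in this minimal Python sketch. The record InstructionState and its field names are hypothetical, and "between A and B" is assumed to be inclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstructionState:
    """Decoded "internal state" of the current object instruction."""
    opcode: str
    source_value: Optional[int]          # None if there is no source operand
    destination_address: Optional[int]   # None if there is no destination


def matches(state: InstructionState, low: int, high: int) -> bool:
    """True for a MOVE of zero into an address in [low, high]."""
    return (
        state.opcode == "MOVE"
        and state.source_value == 0
        and state.destination_address is not None
        and low <= state.destination_address <= high
    )


hit = matches(InstructionState("MOVE", 0, 0o1500), 0o1000, 0o1777)
miss = matches(InstructionState("MOVE", 3, 0o1500), 0o1000, 0o1777)
```

The point of the requirement is that the analysis procedure receives the fields ready-made and never has to decode the instruction word itself.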

It is also helpful if a direct indication of the instruction
class (double-operand, single-operand, no-operand) is available.

The question of access to the timing of the object machine
must also be considered. The Analysis System must be able to
read the clock of the object machine or otherwise determine the
object machine time easily, at least after each instruction.
For some applications, it may be necessary to determine the object
machine time after each major (primary memory) cycle or each
minor (register transfer) cycle.

While this is not an absolute necessity (as we have shown
that we can get by without it in the DAME system), it would be
desirable to have access to the user program text and symbol
table, so that the user could converse with the system in terms
of his own symbols.

It is clear from the foregoing discussions of control flow
analysis, that the user, in cooperation with the system, will
define a topology or structure over his program for purposes of
control flow history. It is also clear from those discussions
that empirical data associated with each component of that structure
will be generated during the execution of the user program
and that this data will be linked to the appropriate parts of
the control flow history. Each of these elements of information,
i.e. user program structure, control flow history and
dynamically-generated empirical data, must also be accessible by the user.

2.5.2 Triggering of Analysis Actions

The user must be able to execute any (meaningful) set of
analysis actions after every operand fetch, store, instruction
fetch, instruction completion or at specific points in time
(i.e. relative to object machine clock). Further, the user must
be able to specify, optionally, address ranges or registers
for which the stated action is applicable. In the rest of the
thesis, I shall refer to a stated sequence of actions to be
activated at one of the above points as a "hook".
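
A hook table with optional address ranges could be organized as in the Python sketch below. It is an illustration of the requirement only and does not describe the DAME hook mechanism; the event names, the classes Hook and HookTable and the inclusive range convention are assumptions, and time-triggered hooks are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

EVENTS = ("operand_fetch", "store", "instruction_fetch", "instruction_completion")


@dataclass
class Hook:
    action: Callable[[str, int], None]
    address_range: Optional[Tuple[int, int]] = None   # inclusive; None = all addresses
    enabled: bool = True


@dataclass
class HookTable:
    hooks: Dict[str, List[Hook]] = field(
        default_factory=lambda: {event: [] for event in EVENTS}
    )

    def insert(self, event: str, hook: Hook) -> None:
        self.hooks[event].append(hook)

    def trigger(self, event: str, address: int) -> None:
        """Called by the simulator at each triggering point."""
        for hook in self.hooks[event]:
            in_range = hook.address_range is None or (
                hook.address_range[0] <= address <= hook.address_range[1]
            )
            if hook.enabled and in_range:
                hook.action(event, address)


log: List[Tuple[str, int]] = []
table = HookTable()
table.insert("store", Hook(lambda e, a: log.append((e, a)), address_range=(0o1000, 0o1777)))
table.trigger("store", 0o1234)            # inside the range: the hook fires
table.trigger("store", 0o2000)            # outside the range: ignored
table.trigger("operand_fetch", 0o1234)    # no hook on this event
```

Only the first trigger reaches the action, so the log ends up with a single entry for the store to octal address 1234.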

2.5.3 The Instruction Set of the Analysis Facility

The instruction set of the Analysis System should contain
two classes of instructions:

1- A complement of instructions similar to those of conventional
programming languages: these will be used to perform
assignment, arithmetic and logical operations, conditional execution,
looping, subroutine call with parameters and I/O. In
fact, this subset of the instruction set should be a programming
language which is "complete in a practical sense". All the
computations, such as those encountered in performance analysis
or flow analysis, can be potentially done in this subset of the
language.

However, as mentioned earlier, in the case that the Analysis
System instruction set turns out to be unsuitable for long
computations, there should be an escape mechanism through which
one can execute subroutines which are written in a more suitable
language (possibly the one in which the Analysis System itself
is written). If that language has a syntactic construct, similar
to a function in some languages, which returns a value, then
it should be possible to assign the value returned by such a
construct to a symbol in the symbol space of the Analysis System.

2- A complement of instructions particularly useful in
monitoring and execution analysis. These should include the
following operations:

(i) Inserting, deleting, enabling or disabling hooks
statically and dynamically,

(ii) Defining "parts" in the execution path whose input
and output sets (discussed earlier under Data Flow, in sub-section
2.2.2) are to be determined automatically and made accessible
to the user,

(iii) Searching the input and output sets of previous parts
for one which satisfies a user-specified predicate (better yet,
making each set available to the user in some systematic manner,
e.g. reverse chronological order, letting the user perform
arbitrary computations using the elements, i.e. <address, value>
pairs, in the set, and tell the system whether he wants to
continue the search or not),

(iv) Displaying input/output sets in an appropriate
format (i.e. indicating relations between addresses and values,
and the "byte" position and size where applicable),

(v) Backtracking, to the beginning or end of a part found
in a search or specified explicitly by the user,

(vi) Moving further back or moving forward following the
execution of some instructions from a "backtracked" position,

(vii) Resuming execution from the point where the backtrack
instruction was issued.
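
Operation (iii) can be sketched in Python as a search over recorded sets in reverse chronological order. The representation of a set as a mapping from address to value and the name search_back are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

ValueSet = Dict[int, int]     # address -> value


def search_back(history: List[ValueSet],
                predicate: Callable[[int, int], bool]) -> Optional[int]:
    """Return the index of the most recent part whose set has a matching pair.

    `history` is ordered oldest first; the search runs from the newest
    part backwards and stops at the first <address, value> pair that
    satisfies the user-supplied predicate.
    """
    for index in range(len(history) - 1, -1, -1):
        if any(predicate(a, v) for a, v in history[index].items()):
            return index
    return None


# Output sets of three consecutive parts, oldest first.
output_sets = [{0o100: 1}, {0o200: 0, 0o100: 7}, {0o300: 5}]
found = search_back(output_sets, lambda address, value: address == 0o100)
```

The search finds the second part (index 1), the most recent one to have written location 100; the returned index is the kind of position to which a backtrack operation such as (v) could then be directed.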

2.5.4 External Appearance and Miscellaneous Useful Features

Since the main design goal for the Analysis Facility is to
facilitate the performance of analyses of program behaviour,
clearly the associated command language should be easy to use
and have good error-detection features. It must be noted that
unreasonable-looking results obtained by some analysis procedure
are, in some sense, "doubly hard" to disprove or verify, since
one may have to re-examine both the analysis procedure and the
process under analysis to determine the validity of the obtained
result. Further, one frequently has to compose analysis procedures
in a short period of time, often in an interactive, spontaneous
fashion, a condition which increases the probability of
making errors.

All of these conditions point to the requirement that the
language of the Analysis Facility be simple and terse in syntax,
encourage structured programming, and not rely very heavily on
remembering many keywords. This last requirement is probably
the most difficult to achieve due to the variety of specialized
functions which have to be performed in collecting and searching
execution history data. Further, the objectives of a powerful
language and simplicity of syntax conflict with the objective
of not relying heavily on remembering many keywords. However, it
has been shown that very good compromises can be reached; witness
APL and LISP.

These conditions also mean that often the same commands
with minor modifications will be entered repeatedly, increasing
the possibility of making typing errors each time they are entered.
Hence, the analysis facility should possess a "library" capability
where frequently used command sequences can be stored and called
when needed. Also, a good editing facility for editing both
"on-line", i.e. loaded command sequences, and "off-line", i.e.
text files, is extremely useful.




In  referencing  object  machine  instructions,  e.g.  tracing 
them  as  they  are  executed,  or  displaying  a  block  of  instructions, 
the  analysis  facility  should  be  able  to  deal  with  symbolic  forms 
as  well  as  numerical.  This  ability  is  available  in  most  on-line 
debugging  systems  today. 

It must be noted here that the mental picture, on which
these functional requirements are based, is that of an interactive,
on-line analysis facility. For batch systems, some of
the requirements become more severe, and some less. For example,
in an on-line system, the user may display the values of some
variables and base his next action on what he sees. In a batch
system, this is not possible. Hence, the user would like to do
the next best thing - namely, program the reasoning process
he uses into the analysis procedure. (To the extent that this
process can be mechanized, this is even preferable to the visual
examination by the user.) This means that especially in a batch
system, anything that the user would like to see displayed in an
interactive system should be available "inside the machine" to
analysis procedures. On the other hand, the "terseness" requirement
for the analysis language is not as severe in a batch system
as in an interactive system.




CHAPTER  3 

THE  DAME  SYSTEM 


In this chapter, I shall describe the design of the DAME
(Dynamic Analysis and Modelling Environment) system, which has
been the major vehicle in my research for implementing, experimenting
with and evaluating new ideas. DAME is a facility
for studying the logical behaviour and the performance of programs
for the PDP-11/20. It consists of a PDP-11/20 simulator and
a programmable analysis facility which achieves most of the
requirements set forth in the last chapter. The main goal in
the design of DAME was to isolate critical problem areas in the
design of a general-purpose execution analysis facility (GPEAF),
for which solutions had not been developed as yet and to propose
solutions to, at least some of, these problems. It was not the
intention to develop a finished, tuned-up utility system for
general use. Hence, some features for which satisfactory techniques
were already known and which would be very desirable in a
system for general use, were omitted from DAME since the effort
required to implement them did not seem justified in view of their
minimal contribution to the research aspect of this project.

However,  despite  such  omissions,  I  have  found  DAME  to  be  a  powerful 
tool  for  analyzing  program  behaviour. 

In  order  to  facilitate  the  reading  of  this  chapter  by  readers 
with  different  objectives,  I  shall  first  provide  a  detailed  outline. 
This  outline  can  also  be  used  as  a  reference  later  to  quickly 
locate  the  section  about  a  particular  point,  as  well  as  to  guide 
the  reader  in  the  first  reading  to  sections  of  more  interest  to 
him. 


Outline of Chapter 3

The  first  topic  is  the  set  of  data  structures  underlying  the 
design  of  DAME.  In  Section  3.1,  I  first  summarize  these  structures 
and  then  discuss  in  more  detail  some  of  them,  namely,  the  formats 
of  objects  and  lists  as  well  as  certain  master  lists  and  symbol 
tables  which  play  an  essential  role  in  the  implementation  of  DAME. 


The  description  of  these  structures  is  provided  only  because 
they  facilitate  certain  search  operations  over  pre-defined  classes 
of  objects.  An  understanding  of  them  is  not  required  for  an 
overall  understanding  of  DAME. 




*"3-;.“*°" - il»  another  data  structure,  the  representation 

of  the  PDP-11  core,  is  described.  In  this  connection,  I  also 
present  the  general  problem  of  representing  the  memory  of  one 
computer  in  another,  emphasizing  the  problems  related  to  the 
respective  memory  sizes  and  word  lengths  of  the  two  machines. 

.  .  ]n  --e-ction  3  •  3-»  the  question  of  the  "time-grain"  of  simula¬ 
tion  is  cons  idered  .  In  particular,  the  costs  and  benefits  of 

simulation  at  the  memory  cycle  level  and  at  the  instruction  level 
are  briefly  discussed  and  compared.  (Note:  This  topic  is  dis¬ 
cussed  In  more  detail  in  Chapter  5.) 

In Section 3.4, the hook mechanism is described. The types
of hooks and the points in the PDP-11 instruction cycle at which
they may be placed are explained. Whenever a hook is activated,
the PDP-11 simulator makes available to the user certain information
on the current state of the processor and the Unibus by storing
that information in PDP-10 global symbols. In this section, a
list of the PDP-10 global symbols used for this purpose is given.

In Section 3.5, the most significant feature of DAME, the
Node Mechanism, is described. This mechanism permits a guaranteed
backtrack  capability  to  any  point  in  the  execution  history  and 
an  analysis  of  data  flow  in  terms  of  user-defined  nodes.  The 
user  thus  has  almost  complete  control  over  the  amount  of  execution 
history  information  collected  by  the  system. 

An understanding of the Hook Mechanism (Section 3.4) and of
the Node Mechanism (Section 3.5) is essential to the understanding
of the rest of the thesis.

In Section 3.6, an outline of the DAME instruction set is
given. First the general syntax of DAME instructions is specified.
The instruction set is divided into two subsets. The first subset
(Section 3.6.1) contains the instructions provided for normal
programming operations such as assignment, arithmetic, looping
and the like. These instructions are listed without much explanation,
except for several instructions which are more uncommon
(e.g. a search-list instruction). The latter are explained in
detail. The second subset of instructions (Section 3.6.2) consists
of those which are specifically designed for monitoring the execution
of the -11, collecting data and searching them. These are
also explained individually. An understanding of this section
should be sufficient to follow the detailed illustrations given
in the next chapter. However, for those who wish a more detailed
and systematic description of the instruction set, a user's manual
is provided in Appendix A of the thesis.


In the final section, Section 3.7, some unimplemented ideas
for improving the performance of DAME are discussed. They are
immediately implementable, as opposed to future research, ideas.

3.1 The Underlying Data Structures

In  DAME,  one  has  access  to  three  address  spaces: 

1-  Objects  and  list  structures,  which  are  the  main  class 
of  entities  that  DAME  deals  with, 

2-  PDP-11  core,  general  and  device  registers, 

3-  Global  PDP-10  symbols  used  in  the  simulator. 

(Note: These three address spaces are not disjoint; the
PDP-11 registers and core are also accessible as PDP-10 global
symbols. Some operations which normally operate on objects can
also operate on -10 globals.)

In  the  rest  of  this  section  I  shall  describe  the  structure 
and  possible  attributes  of  objects,  and  several  global,  pre-defined 
list  structures  which  are  crucial  to  the  implementation  of  DAME. 

The  other  two  address  spaces  will  be  discussed  in  succeeding 
sections. The reader who is not concerned with the implementation
can skip to Section 3.2 without loss of continuity.

Most  of  the  information  structures  generated  by  DAME  during 
the  execution  of  an  -11  program  are  in  the  form  of  lists,  as 
are  DAME  routines  themselves  and  most  of  the  pre-defined  infor¬ 
mation  in  the  environment.  The  basic  list-processing  functions 
and  the  PDP-11  simulator  are  implemented  via  the  BLISS-based 
general-purpose simulation package POOMAS, developed by Amund
Lunde [LU 71].

Attributes of DAME Objects

Each  DAME  object  has  the  following  attributes;  a  "successor", 
a  "predecessor",  a  "size",  a  "class",  a  "subclass"  and  possibly 
a  list  of  "secondary  attributes".  (The  first  four  attributes 
are  provided  by  POOMAS.)  Objects  which  are  not  members  of  any 
list  contain  a  special  code,  NONE,  as  their  successor  and  predecessor 
attributes. 

All of the above attributes of an object, except the secondary
attributes, are represented in three "system words" preceding the
first "user word" of the object. Objects are addressed by their
first user word, called "word 0". The system words are also
called "word -1", "word -2" and "word -3". The standard object
format is shown in the next figure.
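
The addressing convention just described can be pictured with a minimal Python sketch. It is an illustration only, not the POOMAS implementation: the flat list used as a store, the function names, and in particular the assignment of attributes to words -1, -2 and -3 are assumptions, since the text does not say which attribute occupies which system word.

```python
NONE = -1            # stand-in for DAME's special NONE code
SYSTEM_WORDS = 3     # words -1, -2 and -3 precede word 0

store = []           # hypothetical flat word store


def create_object(size, cls, subclass):
    """Allocate an object and return the address of its word 0."""
    base = len(store)
    # Hypothetical packing of the five primary attributes into 3 words.
    store.append((NONE, NONE))       # word -3: (successor, predecessor)
    store.append(size)               # word -2: size in user words
    store.append((cls, subclass))    # word -1: (class, subclass)
    store.extend([0] * size)         # user words 0 .. size-1
    return base + SYSTEM_WORDS


def word(addr, n):
    """Read word n of the object at addr; n may be -1, -2 or -3."""
    return store[addr + n]


a = create_object(size=2, cls="numeric variable", subclass="node")
b = create_object(size=1, cls="numeric variable", subclass="hook")
```

Because an object is addressed by its word 0, user code never needs to know how many system words exist; only the system routines index with negative offsets.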




The "class" attribute is used mostly by POOMAS (e.g. to
identify list heads, objects representing process pointers and
event notices on the simulation event calendar). DAME also
uses the class attribute to indicate what is usually called
"data types" in programming languages, namely whether some data
is normally to be treated as a character constant, character
variable, numeric constant, numeric variable, etc.

In  addition  to  the  class  attribute,  DAME  objects  have  a 
"subclass"  attribute.  The  subclass  attribute  designates  the 
general  function  of  an  object,  e.g.  DAME  instruction  subclass, 
hook subclass, node subclass, input-set subclass, output-set
subclass.

The "secondary attributes" are those attributes which may
be defined for some objects but not necessarily all. The secondary
attributes of an object are themselves represented as objects
and are put on that object's Secondary Attribute List (SAL).
The SAL also serves as a convenient place for the user to save
arbitrary information which is to be associated with an object
but which can not be a part of its contents. For example, suppose
a user would like to record the contents of some core locations
whenever a certain node is entered. He can do that by creating
an object of a particular subclass every time the node is entered,
copying the contents of the locations he is interested in into
the object and putting it on the node-object's SAL. He can later
retrieve that information by a special DAME-supplied function by
giving the subclass of the secondary-attribute-object.
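
The example in the preceding paragraph can be sketched in Python as follows. This is a hedged illustration of the idea only: the class `Obj`, the subclass name "snapshot" and the helper functions are hypothetical and do not correspond to actual DAME instructions.

```python
class Obj:
    """A toy object with a subclass, contents and a SAL."""

    def __init__(self, subclass, contents=None):
        self.subclass = subclass
        self.contents = list(contents or [])
        self.sal = []                 # Secondary Attribute List


def add_secondary(owner, attr):
    """Put a secondary-attribute object on the owner's SAL."""
    owner.sal.append(attr)


def secondary_by_subclass(owner, subclass):
    """Retrieve all secondary attributes of the given subclass."""
    return [attr for attr in owner.sal if attr.subclass == subclass]


core = {0o10000: 5, 0o10002: 7}       # toy PDP-11 core
node = Obj("node")


def on_node_entry(node_obj, addresses):
    """Record the contents of some locations each time the node is entered."""
    snap = Obj("snapshot", [(addr, core[addr]) for addr in addresses])
    add_secondary(node_obj, snap)


on_node_entry(node, [0o10000, 0o10002])
core[0o10000] = 9
on_node_entry(node, [0o10000, 0o10002])
snaps = secondary_by_subclass(node, "snapshot")
```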

Subclass  Master  Lists 

In order to provide access to objects via their subclass
(i.e. their general function) there is a master list for each
subclass, which contains a pointer to every object of that subclass.
Thus, for example, it is possible to search the set of
all node-objects or hook-objects for one satisfying a particular
condition, or to delete all the DAME routines defined so far, etc.
In particular, there is a subclass called "subclass-master subclass",
which contains all these subclass master lists. Most
of the objects existing at any given point in time can be accessed,
without knowing their name or address, through these master lists.
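
A minimal sketch of the master-list idea follows, assuming a Python dictionary keyed by subclass name; the dictionary itself plays the role of the "subclass-master subclass". The function and field names are hypothetical.

```python
from collections import defaultdict

# subclass name -> master list holding every object of that subclass
masters = defaultdict(list)


def create(name, subclass, **fields):
    """Create an object and register it on its subclass master list."""
    obj = {"name": name, "subclass": subclass, **fields}
    masters[subclass].append(obj)
    return obj


def search(subclass, predicate):
    """Return the first object of a subclass satisfying the predicate."""
    for obj in masters[subclass]:
        if predicate(obj):
            return obj
    return None


def delete_all(subclass):
    """Delete every object of a subclass, e.g. all routines so far."""
    masters[subclass].clear()


create("N1", "node", entry=0o1000)
create("N2", "node", entry=0o2000)
create("H1", "hook", hook_type="AF")
found = search("node", lambda o: o["entry"] == 0o2000)
```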

Symbol Tables

In  addition  to  the  subclass  masters,  there  is  a  conventional 
symbol  table  maintained  by  DAME,  which  permits  access  to  the 
objects  by  their  names.  The  Symbol  Table  is  also  organized  as  a 
list  and  can  be  searched  by  the  usual  list  processing  functions. 
Since the user can refer to global PDP-10 symbols, the DDT symbol
table is also present during execution. (A list of some useful
symbols is given in the User Manual in Appendix A.) In translating
a DAME instruction, if a name can not be found in the DAME
symbol table, then the DDT symbol table is searched. (These
symbol  tables  are  not  to  be  confused  with  that  used  by  the  PDP-11 
assembler  for  PDP-11  symbols.  The  latter  is  not  saved  by  the 
assembler  after  assembly  and  is  not  available  to  DAME.) 
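
The lookup order described above can be sketched as follows. The table contents and the function name `resolve` are invented for illustration; only the order of the search (DAME table first, then DDT table) comes from the text.

```python
# Toy symbol tables; the names and addresses are made up.
dame_symbols = {"A": 0o4000}
ddt_symbols = {"PC": 0o1234, "A": 0o7777}


def resolve(name):
    """Look a name up in the DAME table first, then in the DDT table."""
    if name in dame_symbols:
        return ("DAME", dame_symbols[name])
    if name in ddt_symbols:
        return ("DDT", ddt_symbols[name])
    raise KeyError(f"undefined symbol: {name}")
```

Note that under this ordering a DAME object name hides a PDP-10 global of the same name.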

3.2  The  Representation  of  One  Main  Memory  Inside  Another 


In the next sub-section, 3.2.1, a discussion of the general
problem of representing one main memory inside another is presented.
Readers interested only in the approach taken in DAME
may skip to the following sub-section, 3.2.2, without loss of
continuity.

3.2.1  The  General  Problem 

One  of  the  basic  representational  issues  in  simulating  one 
computer  inside  another  is  the  representation  of  the  main  memory 
of  the  simulated  machine  (called  the  Object  Machine  or  OM)  in  the 
simulating  machine  (called  the  Host  Machine  or  HM).  The  importance 
of  this  issue  arises  from  the  fact  that  it  may  have  a  big  impact 
on  the  storage  requirements  as  well  as  the  running  speed  of  the 
simulation. (In this discussion, I shall limit myself to word-oriented
machines, i.e. those in which the greatest bulk of
memory accesses address words, as opposed to bits, bytes or
variable-length blocks.) Two major components of this issue are:
(i) The relative word lengths, (ii) The relative sizes of
directly addressable memory in the two machines.

Let us denote by $W_O$ and $W_H$ the word lengths, and by $M_O$ and
$M_H$ the sizes in words of the object and host machine memories,
respectively. (To be more precise, $M_H$ is the size of the portion
of the HM memory which may be used to represent the OM memory.)
In the usual, and most comfortable, case $W_H > W_O$ and $M_H > M_O$. This
permits an explicit and direct representation of each word of the
OM in the HM. If $W_H \ge 2W_O$, then the issue of packing more than one
OM word into one HM word comes up. Clearly, if $M_O$ is much smaller
than $M_H$, and main memory cost is not a problem, or, alternately,
if the HM has no, or very inefficient, instructions for extracting
a field out of an HM word which could represent one OM word, then
the odds are heavily weighted in favor of mapping one OM word
to one HM word. It must also be noted that the increased size
of storage required to represent the OM memory can also degrade
the running speed of the simulation in a time-sharing environment
by increasing the page-fault rate or by causing delays in being
swapped in by the operating system.

If $M_H < M_O$, one can use a "paged" simulated memory technique,
by dividing the OM memory into pages and reading and writing
pages as required from a "paging disk" or drum. All the techniques
which have been brought to bear to improve the performance of
paged systems then become applicable to such a system. If it
turns out that the "working-set" of the program under analysis
is smaller than $M_H$, then the performance of this system approaches
that of one where $M_O < M_H$.

If $W_H < W_O$, then more than one word of HM is needed to represent
one word of OM. In this case, the layout of the OM word must be
designed to minimize the overhead of decoding OM instruction operands,
and any tag bits used by the Hook Mechanism as discussed in
Section 3.4 and Chapter 5.
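
The cases distinguished above can be summarized in a small Python sketch. The function name and the returned field names are hypothetical; the arithmetic only restates the relations between $W_O$, $W_H$, $M_O$ and $M_H$ given in the text, under the simplifying assumption that OM words are never split across a packing boundary.

```python
def plan_representation(w_o, w_h, m_o, m_h):
    """Summarize how OM words could map onto HM words.

    w_o, w_h: word lengths in bits of the object and host machines.
    m_o:      OM memory size in OM words.
    m_h:      HM words available for representing OM memory.
    """
    hm_per_om = (w_o + w_h - 1) // w_h     # > 1 only when W_H < W_O
    om_per_hm = max(1, w_h // w_o)         # > 1 only when W_H >= 2 * W_O
    direct = m_o * hm_per_om               # unpacked representation
    packed = ((m_o + om_per_hm - 1) // om_per_hm) * hm_per_om
    return {
        "hm_words_per_om_word": hm_per_om,
        "om_words_per_hm_word": om_per_hm,
        "hm_words_direct": direct,
        "hm_words_packed": packed,
        "paging_needed": direct > m_h,     # the "paged" technique applies
    }


K = 1024
pdp11_in_pdp10 = plan_representation(16, 36, 28 * K, 192 * K)
wide_in_narrow = plan_representation(36, 16, 1000, 2000)
```

For the 16-bit, 28K PDP-11 in a 36-bit host, the sketch reports that two OM words would fit in one HM word but that the direct mapping needs no paging, which is the trade-off taken up in the next sub-section.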

3.2.2  The  Representation  of  the  PDP-11  in  the  PDP-10 

In the case of representing the 16-bit 28K PDP-11 in the
36-bit PDP-10 with up to 192K core, initially two -11 words were
packed into one -10 word. However, this approach was later abandoned
in favor of using one -10 word per -11 word and utilizing 18
of the remaining bits in the word to address a list of DAME objects
associated with that -11 location. This list, called the Association
List (AL) of the location, contains, for example, the hook-objects
associated with that location, if any. It is also accessible
by the user and may be used to save arbitrary information.
However, normally only a small fraction of core locations have a
non-empty AL. Therefore, this design decision may be considered
wasteful of core. Nonetheless, as will be seen later, in heavily
monitored programs, these lists permit much faster access to the
monitoring actions associated with a particular location. Thus,
in DAME, the low-order 16 bits of the -10 word are used to represent
-11 words, the high-order 18 bits point to the AL, and the remaining
two bits are used in the maintenance of input/output sets (Section 3.5).
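
The word layout just described can be sketched with a few shifts and masks. The placement of the two flag bits in bit positions 16 and 17 follows from the text (low-order 16 bits for the value, high-order 18 bits for the AL pointer); the constant and function names are hypothetical, and Python integers stand in for 36-bit PDP-10 words.

```python
VALUE_MASK = (1 << 16) - 1    # low-order 16 bits: the -11 word
FLAG_SHIFT = 16               # next 2 bits: input/output set maintenance
FLAG_MASK = 0b11
AL_SHIFT = 18                 # high-order 18 bits: pointer to the AL
AL_MASK = (1 << 18) - 1


def pack(value, flags=0, al=0):
    """Build one 36-bit host word from its three fields."""
    if not 0 <= value <= VALUE_MASK:
        raise ValueError("value does not fit in 16 bits")
    if not 0 <= flags <= FLAG_MASK:
        raise ValueError("flags do not fit in 2 bits")
    if not 0 <= al <= AL_MASK:
        raise ValueError("AL pointer does not fit in 18 bits")
    return (al << AL_SHIFT) | (flags << FLAG_SHIFT) | value


def unpack(host_word):
    """Return (value, flags, al) from a 36-bit host word."""
    return (
        host_word & VALUE_MASK,
        (host_word >> FLAG_SHIFT) & FLAG_MASK,
        (host_word >> AL_SHIFT) & AL_MASK,
    )


w = pack(0o177777, flags=0b10, al=0o123456)
```

A location with an empty AL simply carries a zero (or some other agreed null value) in its high-order 18 bits, so the common case costs one mask operation per access.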

Only  the  existing  device  registers  in  the  peripheral  bank 
are  defined;  attempts  to  access  undefined  locations  will  result 
in  a  "time-out  error"  on  the  Unibus  and  an  error  trap  will  occur. 


All  error  conditions  are  handled  just  as  they  are  specified 
in the PDP-11/20 Processor Handbook. The only supported I/O
device  at  present  is  the  TTY.  (Recently,  6  relocation  registers 
were  added  to  handle  C.mmp  programs  [WIT  72].) 

3.3 The Time-Grain of Simulation

This issue has at least as strong an impact on the running
speed of the simulation as the representation of the OM memory.
The factor which has the major influence on the selection of the
time-grain is, clearly, the degree of precision with which one
wants to simulate the operation of the hardware. It has already
been indicated in Chapter 2 that this should be, at least, at
the level of individual instructions. Thus, for example, one
would be guaranteed that after each instruction, the state of
the memory and the value of the simulation clock would be correct
(within the tolerances given in the hardware specifications on
which the simulator is based).

The next lower level in the grain of simulation is the "fetch
instruction-fetch operands-execute" level, which I shall call
"memory cycle level". This level involves, on the average, about
3 to 5 times as many events as the instruction level. If the
OM permits intra-instruction interrupts, e.g. after each memory
cycle, and if one wants to reflect the timing of these interrupts
precisely, then, clearly, one has to design the simulation at
this level.

Due  to  the  existence  of  the  so-called  "non-processor  request" 
(NPR)  interrupts  on  the  PDP-11,  although  at  present  no  device 
which  can  generate  NPR  interrupts  is  supported,  the  simulation 
has  been  designed  at  the  memory  cycle  level.  This  design  decision 
was  also  influenced  by  a  desire  to  permit  studies  at  the  processor- 
Unibus  level.  The  overhead  introduced  by  simulating  at  this 
level, as opposed to instruction level, is studied in Chapter 5.

3.4 The Hook Mechanism

The  principal  mechanism  by  which  the  user  causes  DAME  to  take 
some  action  while  his  program  is  running,  is  the  Hook  Mechanism. 

A  hook  is  an  object  having  two  user  words;  the  first  contains 
a  hook  type,  and  the  second  a  pointer  to  the  list  of  DAME  actions 
to be taken when the hook is triggered. Hooks may be created,
deleted,  enabled  or  disabled  dynamically  by  the  HOOK  command 
explained  in  section  3.6. 

There are two categories of hooks: general hooks and addressed
hooks. Within each category, there are several types. General
hooks are those in which a user-specified DAME action will be taken,
depending on its type, at one of the following points:

1-  After ...

2-  Before ...

3-  After ...

4-  After ...

5-  After ...

6-  After ...

7-  After ...

Addressed hooks differ from general hooks in that
they are applicable only when the specified operation (e.g.
fetch, store) is performed on an address in a specified range.
The types of addressed hooks are:


8-  After every fetch from an address in a given range
(type AF), or

9-  Before every store into an address in a given range
(type AS), or

10-  After every instruction fetched from a given address
range (type AIF), or

11-  The completion of every instruction fetched from an
address range (type AIC).

To insert a hook, the user issues a HOOK command specifying
the hook type, the action to be taken, and, if it is an addressed
hook, the address range to which the hook is to be applicable. He can
insert as many hooks of any type as he desires. Any DAME instruction
can be used in these routines.
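
The triggering of addressed hooks can be illustrated with the following Python sketch. It models only the AF (after fetch) and AS (before store) types named above; the `Hook` class, the `trigger` function and the toy `fetch` and `store` routines are assumptions made for the illustration and are not the simulator's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Hook:
    hook_type: str                                  # e.g. "AF" or "AS"
    actions: List[Callable[[int, int], None]]       # actions to be taken
    addr_range: Optional[Tuple[int, int]] = None    # None: not addressed
    enabled: bool = True


hooks: List[Hook] = []
core = {}
log = []


def trigger(event_type, address, data):
    """Run the actions of every enabled hook matching this event."""
    for hook in hooks:
        if not hook.enabled or hook.hook_type != event_type:
            continue
        if hook.addr_range is not None:
            low, high = hook.addr_range
            if not low <= address <= high:
                continue
        for action in hook.actions:
            action(address, data)


def fetch(address):
    data = core.get(address, 0)
    trigger("AF", address, data)        # after every fetch in range
    return data


def store(address, data):
    trigger("AS", address, data)        # before every store in range
    core[address] = data


hooks.append(Hook("AS", [lambda a, d: log.append(("store", a, d))],
                  (0o10000, 0o10040)))
hooks.append(Hook("AF", [lambda a, d: log.append(("fetch", a, d))],
                  (0o10000, 0o10040)))
store(0o10000, 54)
store(0o20000, 1)       # outside the range: no hook fires
fetch(0o10000)
```

Scanning every hook on every memory cycle, as this sketch does, is the slow approach; attaching hook-objects to the Association List of each location (Section 3.2.2) avoids that scan.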


The  types  of  hooks  available  in  the  DAME  system,  combined 
with the PROBE command which permits the activation of a DAME
routine  at  a  specific  time  on  the  simulation  clock,  satisfy  the 
requirements  listed  in  sub-section  2.5.2,  "Triggering  of  Analysis 
Actions". 


Some Information Made Available by the Simulator

Whenever a hook is activated, the PDP-11 simulator makes
available to the user certain information about the state of the
PDP-11 CPU, by storing this information into global PDP-10
symbols. This includes: (i) The address and data associated
with the machine cycle which activated the hook, (ii) The
operand registers and modes of the current instruction, (iii)
Contents of the DATA, ADDR and CONT lines of the Unibus, (iv)
The simulation clock, (v) The addresses of the current node
object, input set and output set.

The  data  structures  described  in  Sections  3.1,  3.2  and  the 
above  data  elements,  together  with  the  Execute  External  (XX) 
and  Evaluate  (EVAL)  instructions  for  calling  BLISS-10  routines 
described  in  Section  3.6.1,  satisfy  the  list  of  requirements  in 
Section  2.5.1,  "Information  Requirements  of  the  Analysis  System". 

3.5  The  Node  Mechanism 

A  second  major  mechanism  by  which  the  user  causes  DAME  to 
collect  information  about  the  behaviour  of  his  program,  is  the 
so-called  "Node  Mechanism".  The  Node  Mechanism  provides  a  means 
by which the user can break down all or a part of a program into
blocks  (called  "nodes"),  such  that  each  execution  of  a  node  (called 
a  "node  instance")  can  be  considered  as  a  unit  in  recording  the 
history  of  execution  of  that  program.  Recalling  our  requirements 
about  determining  data  flow  among  node  instances  and  that  any 
part of the execution must be reconstructible from the recorded
execution history, it is clear that we can use the node concept
to effect that reconstruction by recreating each instance of
each node. To recreate a particular instance $X_i$ of a node X, we
need to know all the inputs into $X_i$. Hence, for this purpose it
suffices to record each address from which $X_i$ read something
before modifying its contents, and the value read. Let us call
the set of such (address, value) pairs associated with a node
instance the "input-set" of that instance. It is easy to see
how one can back up arbitrarily far in execution history by
restoring the input sets of node instances in reverse chronological
order starting with the current node instance. (Note: to
simplify references to node instances when the identity of the
node itself is not needed, I shall refer to a node instance by
its "index" in a particular execution, so that node instance n
will refer to the nth node instance since the start of the execution.)
We must note here that restoring the input sets of node
instances k-p, (k-p)+1, ..., k, where k is the current node instance,
does not mean that we are restoring the entire machine state
which existed when node instance k-p was entered; we are only
restoring that part of the machine state which will guarantee an
identical replication of the instances k-p through k. Recalling
another functional requirement that we must be able to reconstruct
every past machine state, we realize that we must also record
the effect of each node instance on the machine state. Such
an effect can be represented as a set of (address, old value,
new value) triples containing every address where the node instance
wrote something (even if the old and new contents are the same)
and the contents of that address upon entry and exit from the
node instance. Let us call such a set the "output set" of that
node instance.

We  also  note  that  reconstructing  the  complete  state  which 
existed when node instance k-p was entered, also provides the
ability  to  replicate  the  execution  of  the  node  instances  k-p 
through  k,  i.e.  we  do  not  need  the  input  sets  for  the  purpose 
of  backtracking;  the  output  sets  are  sufficient.  We  still  do 
need  them  however  in  answering  questions  about  data  flow. 
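
The definitions of input and output sets, and the use of output sets for backtracking, can be sketched in Python as follows. This is a simplified model under stated assumptions: core is a small list, every memory access goes through the hypothetical `read` and `write` routines, and peripheral devices are ignored. None of the names are DAME's own.

```python
class NodeInstance:
    def __init__(self):
        self.input_set = {}     # address -> value read before any write
        self.output_set = {}    # address -> [old value, new value]


core = [0] * 8
history = []                    # node instances in chronological order
current = None


def enter_node():
    global current
    current = NodeInstance()
    history.append(current)


def read(addr):
    # An address enters the input set only if it is read before being
    # modified by the current node instance.
    if addr not in current.input_set and addr not in current.output_set:
        current.input_set[addr] = core[addr]
    return core[addr]


def write(addr, value):
    if addr not in current.output_set:
        # First write: core still holds the contents upon entry.
        current.output_set[addr] = [core[addr], value]
    else:
        current.output_set[addr][1] = value
    core[addr] = value


def backtrack(p):
    """Undo the last p node instances using their output sets."""
    global current
    for _ in range(p):
        instance = history.pop()
        for addr, (old, new) in instance.output_set.items():
            core[addr] = old
    current = history[-1] if history else None


enter_node()
write(0, read(0) + 1)
write(1, 5)
snapshot = list(core)           # state when the second instance is entered
enter_node()
write(1, read(1) * 2)
write(2, read(0))
second = current
enter_node()
write(0, 99)
backtrack(2)
```

After `backtrack(2)` the core is identical to `snapshot`, i.e. to the state that existed when the second node instance was entered. The input sets play no part in the restoration; they remain available for questions about data flow.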

One  final  observation  I  wish  to  make  is  that  if  an  address 
appears  both  in  the  input  and  the  output  sets  of  the  same  node 
instance,  then  its  value  in  the  input  set  and  its  "old  value"  in 
the  output  set  are  equal.  This  means  that  whenever  the  two  sets 
contain  the  same  addresses,  the  first  two  elements  of  the  triple 
in  the  output  set  (or  equivalently,  the  pair  in  the  input  set) 
are  redundant.  An  empirical  study  of  some  input  and  output  sets 
shows  that  this  redundancy  is  almost  complete,  i.e.  with  very 
few  exceptions,  every  address  which  appears  in  an  output  set  also 
appears in the corresponding input set. This means that when we
restore the input sets of the last n instances in reverse chronological
order, we almost always restore the complete machine state
which  existed  just  before  the  n-instance  sequence;  however  we 
always  restore  a  sufficient  part  of  the  machine  state  to  guarantee 
an  identical  replication  of  the  execution  if  a  backtrack  is 
requested.  (Important  note:  Here  we  are  neglecting  the  effects 
of  peripheral  devices,  such  as  the  setting  of  status  or  data 
registers.  These  effects  constitute  communication  between  two 
independent  processors,  i.e.  the  I/O  device  and  the  CPU.  DAME 
does  not  offer  facilities  for  backtracking  over  periods  in  which 
such  communication  between  two  processors  occurred.  However, 
such  a  facility  may  be  programmed  by  the  user  and  inserted  as 
addressed  hooks  in  such  device  registers.) 

So  far,  we  have  not  specified  whether  nodes  can  be  overlapped 
or  nested.  In  the  DAME  system,  if  input/output  sets  are  not 
being  used,  nodes  may  be  nested  or  overlapped,  provided  they  do 
not  overlap  at  entry  and  exit  points.  If  input/output  sets  are 
being  used,  overlapped  nodes  are  permitted,  provided  they  do  not 
overlap  at  entry  or  exit  points.  In  particular,  for  example,  a 
subroutine  which  is  called  from  two  different  nodes  constitutes 
a part of each node instance in which it is called. If nodes are
nested, input/output sets become much more expensive to build.
In the case of un-nested nodes, each address is tagged with a
single bit when it is first generated, and in subsequent uses
of the same address in the same node instance, that bit prevents
it from being entered again in the input or output list; i.e.
a search of the input/output set for each generated address is
not required to avoid repetition in the input/output sets. With
nested nodes however, the single-bit mechanism does not suffice.
One needs either a "bit stack" for each address or one has to
search the input/output sets of each nested instance for each
generated address; both are very expensive to implement. Hence,
this particular question may be regarded as an "unsolved problem".
In the current implementation, DAME does not permit the use of
I/O sets if nodes are nested.
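
The tagging scheme for un-nested nodes can be sketched as follows. The sketch assumes one tag bit per set for each address (an input bit and an output bit), which is my reading of the two spare bits per word mentioned in Section 3.2.2; the text above speaks of a single bit, so this detail is an assumption. All names are hypothetical.

```python
IN_BIT, OUT_BIT = 1, 2

core = [0] * 8
tags = [0] * 8          # tag bits kept alongside each core location
input_set = []          # (address, value) pairs
first_writes = []       # (address, old value) pairs


def read(addr):
    if tags[addr] == 0:                  # first use in this node instance
        input_set.append((addr, core[addr]))
        tags[addr] |= IN_BIT
    return core[addr]


def write(addr, value):
    if not tags[addr] & OUT_BIT:         # first write in this instance
        first_writes.append((addr, core[addr]))
        tags[addr] |= OUT_BIT
    core[addr] = value


def exit_node():
    """Close the node instance; return (input set, output set)."""
    outputs = [(addr, old, core[addr]) for addr, old in first_writes]
    inputs = list(input_set)
    for addr, _ in input_set:            # clear tags for the next instance
        tags[addr] = 0
    for addr, _ in first_writes:
        tags[addr] = 0
    input_set.clear()
    first_writes.clear()
    return inputs, outputs


core[3] = 7
read(3)
read(3)                 # tag bit suppresses a second entry
write(3, 8)
write(3, 9)             # tag bit suppresses a second entry
write(4, 1)
read(4)                 # written first, so not an input
ins, outs = exit_node()
```

Each access costs one bit test instead of a search of the sets. With nested nodes the same bit would have to mean different things for each enclosing instance, which is why a "bit stack" or a search becomes necessary.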


This  limitation  is  analogous  to  not  having  a  block-structured 
programming  language.  While  the  availability  of  local  variables 
and  dynamic  storage  allocation  with  arbitrarily  short  scopes  in 
a  language  like  ALGOL  or  BLISS  is  very  desirable  in  many  instances, 
the  same  algorithms  can  be  programmed  in  FORTRAN  or  APL,  which 
do  not  have  block  structure,  in  very  similar  ways.  Similarly, 
the  unavailability  of  nested  I/O  sets  did  not  handicap  me 
significantly  in  the  analysis  problems  I  attacked.  I  was  able 
to  get  around  the  problem  by  planning  my  approach  in  terms  of 
one-level  nodes,  just  as  one  does  in  FORTRAN  and  APL. 


3.6  An  Outline  of  DAME  Instruction  Set 


In this section, I shall outline the DAME instruction set
and discuss in more detail the instructions particularly useful
for monitoring and analyzing the execution of the PDP-11.
Where syntactic descriptions are needed, a BNF-like notation will
be used, with "/" denoting disjunction, "<" and ">" delimiting
non-terminal symbols and "[" and "]" delimiting optional operands.
The description is intended to be easily understandable and,
where there is a conflict between that objective and conciseness
and/or terseness, I shall emphasize intelligibility. For those
who wish a more detailed description, a document called "Introduction
to DAME", which also serves as a User Manual, is included
in the Appendix.

DAME  instructions  can  be  executed  immediately  or  given  a 
name  and  saved  for  deferred  execution.  The  latter  are  referred 
to  as  DAME  routines.  They  can  be  defined  on-line  or  retrieved 
from  a  text  file  by  special-purpose  DAME  instructions. 

The syntax of a DAME instruction is:

<DAME instruction> -> <Type-1 instruction> / <Type-2 instruction>

<Type-1 instruction> -> <operator>(<operand list>)

<Type-2 instruction> -> <operator>(<operand list> <action>)

<operand list> -> <operand> / <operand list> <operand>

<operand> -> <octal integer> /
             '<short char. string> /
             <global -10 symbol> /
             <object name>

<action> -> <DAME routine name> / <compound instruction>

<short char. string> -> <up to 5 characters>

<compound instruction> -> (<DAME instruction list>)

<DAME instruction list> -> <DAME instruction> /
                           <DAME instruction list> <DAME instruction>

As can be seen, some DAME instructions take simple operand
lists while others (in particular, IF, INCR, WHL, HOOK and ALONG
instructions) can optionally take the name of a DAME routine or
a compound-instruction (the analogue of a compound statement or
compound expression in block-oriented languages) to be executed,
as the last operand. All operands of a DAME instruction must
be defined prior to the execution of that instruction. Objects
which are not pre-defined by the system are defined by the
Create (CR) instruction (except for DAME routines, hooks and
value-trace objects, as described later). The form @<octal integer>
refers to the contents of -11 core location <octal integer> at
the time the DAME instruction containing the form is executed.
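
To make the shape of this syntax concrete, here is a small recursive-descent parser written in Python. It is a sketch of how such instructions could be parsed, not a description of the DAME translator: it treats every operand as an uninterpreted token, accepts a compound instruction in any operand position, and does no checking of operators or operand counts.

```python
import re

TOKEN = re.compile(r"[()]|[^\s()]+")    # a parenthesis or a run of other chars


def tokenize(text):
    return TOKEN.findall(text)


def parse_instruction(tokens, pos):
    """Parse <operator>(<operands>); return (tree, next position)."""
    if (pos + 1 >= len(tokens) or tokens[pos] in ("(", ")")
            or tokens[pos + 1] != "("):
        raise SyntaxError(f"expected <operator>( at token {pos}")
    operator = tokens[pos]
    pos += 2
    operands = []
    while pos < len(tokens) and tokens[pos] != ")":
        if tokens[pos] == "(":          # a compound instruction
            body, pos = parse_compound(tokens, pos)
            operands.append(body)
        else:                           # a simple operand
            operands.append(tokens[pos])
            pos += 1
    if pos >= len(tokens):
        raise SyntaxError("missing ')'")
    return (operator, operands), pos + 1


def parse_compound(tokens, pos):
    """Parse (<DAME instruction list>); return (list, next position)."""
    pos += 1                            # skip the opening parenthesis
    body = []
    while pos < len(tokens) and tokens[pos] != ")":
        instruction, pos = parse_instruction(tokens, pos)
        body.append(instruction)
    if pos >= len(tokens):
        raise SyntaxError("missing ')'")
    return body, pos + 1


def parse(text):
    tokens = tokenize(text)
    tree, pos = parse_instruction(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("unexpected text after instruction")
    return tree


tree = parse("IF(A 'GT B (IOBJ(A 0 B)) (IOBJ(B 0 A)))")
```

The uniform operator-then-parenthesized-operands form is what keeps such a parser this short; there is no operator precedence or statement separator to handle.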

3.6.1  General  Purpose  Computation  Instructions 

DAME provides a complement of instructions corresponding to
the usual constructs used in programming, to wit: assignment,
arithmetic and logical operations, looping and conditional execution,
subroutine calling and I/O. I give a brief list of these
instructions here in order to convey their basic functions and
appearance. A detailed description of their effects is given in
Appendix A.

Create  object: 

CR(’-obj.name>  [-class-  -subclass>  -size?1) 

(e.g.  CR(’A)) 

Delete  object: 

DEL(-obj . id?) 

(e.g.  DEL (A) ) 

Insert  in  object: 

IOBJ(-target  -word  no. '  -value>) 

(e.g.  IOB J (A  0  2)) 

Insert  indirect  in  object: 

II0BJ(-target>  -obj.id  word  no. >) 

(e.g.  I  I  0  B  J  ( A  B  0)) 


Insert  in  PDP-11  address: 

I(<address>  rvalue  >) 

(e.g.  1(10000  54)) 

If-then-else : 

IF(^opdl>  '<rel>  <opd2>  <t hen-ac t  ion >  [  <  e  1  se -ac  t  ion  >  ) ) 

(e.g.  I F ( A  ' GT  B  (I0BJ(A  0  B))  (T0BJ(B  0  A)))) 

Wh i 1 e -d  o : 

WHL(<opd>  <action>) 

Incr-from-to-by-do  : 

INCR( <var  ^from-opd  *  <to-opd>  <step-opd>  <ac  t  ion  ' ) 

(e.g.  I N  C  R  ( A  10000  10040  2  (I(A  0)))) 

Execute  DAME  routine: 

EX(<routine>) 

(e.g.  EX(ROUTl)  execute  routine  ROl’  T 1 ) 

Push  parameter: 

PUSH  (<value>) 

(e.g.  PUSH(A)  cush  contents  of  A) 

Pop  parameter : 

P0P(<obj  .  id>) 

(e.g.  POP(B)  pop  into  B) 

Return  K  levels: 

RET ( < 1 e ve 1  count>) 

(e.g.  RET(N)  exit  N  levels  of  nesting) 

Type  out  object: 

TOBJ (<obj . id>) 

(e.g.  TOBJ(A)  type  object  A) 

Type  object  indirect: 

TIOBJ ( <ob j  .id  ) 

(e.g.  TIOBJ(A)  type  object  pointed  by  A) 

Type  PDP-10  symbol: 

TY 1 0 ( < global  variable  id  > ) 

'’e.g.  TY10(PC)  type  contents  of  program  counter) 

Type  contents  of  PDP-11  addresses: 

T(<  start  ing  ad  d  r  e  s  s  >  [ -"end  i  ng  address>l) 

(e.g.  T(  10000  A)  ) 

Type  immediate  : 

TI(«.literal>) 

(e.g.  T I ( ' ABC) 


type  the  char. string  "ABC") 


Write disk:

WDSK(<obj. id>)

(e.g. WDSK(A) write the contents of A
in file USER.DAM)

Write disk indirect:

WIDSK(<obj. id>)

(e.g. WIDSK(MNODESC) write the contents of all
node-objects in file USER.DAM.
Recall that MNODESC contains
a pointer to the node-subclass
master list)

Read disk:

RDSK(<obj. id>)

(e.g. RDSK(A) read a word into object A
from file USER.DAM. Read
and write operations on the
same file can not be intermixed
without closing the file.)

Generalized unary operation with assignment:

UA('<unary op.> <target> <opd>)

<unary op.> -> SUC/PRED/SAL/SIZE/ADDR/NOT

(e.g. UA('SUC A B) put in A the address of the successor of B)

Generalized binary operation with assignment:

BA('<binary op.> <target> <opd1> <opd2>)

<binary op.> -> +/-/*/<slash>/AND/OR/XOR/!
where <slash> denotes the division operator "/"

(e.g. BA('+ A A B) add B to A)

Execute external routine:

XX(<PDP-10 routine id> [<param. list>])

(e.g. XX(TYPLIS 10000) execute TYPLIS(10000))

Execute external routine and assign returned value:

EVAL(<target> <PDP-10 routine id> [<param. list>])

(e.g. EVAL(A TYPLIS 10000) A <- TYPLIS(10000))

Get the value of simulation time and assign:

TIME(<target> '<scale> '<type>)

<scale> -> MICS/MILS
<type> -> FIX/FLOAT

(e.g. TIME(A MICS FIX) insert in A the simulation
time in microseconds, as an integer)




Plot character:

PLOT(<position> '<char>)

(e.g. PLOT(5 'X) type char. "X" in column 5)

The DAME language was designed to provide a simple syntax
in order to minimize syntax errors in the analysis process and
to facilitate its translation. It was also intended to be a
"low-level" language into which a higher level analysis language,
such as the one discussed later in Chapter 6, could be compiled.

While I shall leave the undefined non-terminal symbols and
most of the semantics of the above instructions unelaborated,
a few explanations are in order. Wherever a numeric argument is
expected, if a name is supplied, its contents are taken. The
non-terminal <target> denotes the name or address of an object
into which the assignment is to be made. The syntax of the
generalized unary and binary operations with assignment is admittedly
very awkward, but it permitted me to save some code in interpreting
the operands for each operation.

In addition to these instructions, since the fundamental
data structures used by DAME are lists, there is a set of list
manipulation facilities. Some of these are provided by POOMAS
and are accessible via the XX and EVAL instructions listed above.
These are routines for creating, deleting and maintaining lists.
DAME provides facilities for taking the union, intersection and
set difference of two lists and assigning the result to a third
list, in a syntax similar to the preceding instructions. It
also offers a unique "Search List" instruction whose syntax is:

SLIST(<target> <list id> <search spec.>)

<search spec.> -> <action>

which works as follows: DAME pushes the address of the first
element of the list <list id> on the "data stack". (The Push
and Pop instructions listed above operate on this stack. There
is a second stack, called the "monitor stack", used for DAME
routine calls.) Control is then passed to the DAME instructions
specified in <search spec.>. In the <search spec.>, the user
must obtain the stacked address with a POP instruction. He can
then perform arbitrary computations, preferably without further
manipulating the stack. If he wishes to end the search, he
PUSHes a 1, in which case the stack will be popped by
DAME, the address of the current element in the list will be
stored in <target> and the instruction execution terminated. If
the user wishes to continue the search, he can PUSH a 0. In
this case, after the stack is popped, if the end of the list is
reached, DAME will insert a 36-bit -1 in <target> and terminate
the instruction. Otherwise, the address of the next element of
the list will be PUSHed and the cycle will be repeated. For
example, the instruction


SLIST(A LISTA (POP(B)
               1 1 OBJ(C B 3)
               IF(C 'GE 5
                  (PUSH(1)) (PUSH(0)))))

would search LISTA for an element whose fourth user word contains
5. If such an element is found, its address is returned in
object A; otherwise A will contain a -1.
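The push/pop protocol of the Search List instruction can be sketched in Python. This is only an illustrative model under stated assumptions, not DAME itself: the list is a Python list, an element's "address" is its index, and the names `slist`, `data_stack` and `fourth_word_ge_5` are invented for the example.

```python
# Illustrative model of the SLIST protocol; not DAME code.
# Assumptions: a list element's "address" is its index in a Python list,
# and the search spec. is a Python function operating on a shared stack.

NOT_FOUND = -1  # stands in for the 36-bit -1 stored in <target>


def slist(elements, search_spec):
    """Return the address of the first element accepted by search_spec."""
    data_stack = []
    for address in range(len(elements)):
        data_stack.append(address)          # DAME pushes the element address
        search_spec(data_stack, elements)   # user code must POP it ...
        if data_stack.pop() == 1:           # ... and PUSH 1 (stop) or 0 (go on)
            return address
    return NOT_FOUND


def fourth_word_ge_5(data_stack, elements):
    """Search spec. mirroring the example: test the fourth user word."""
    b = data_stack.pop()                    # POP(B)
    c = elements[b][3]                      # fourth user word of the element
    data_stack.append(1 if c >= 5 else 0)   # IF(C 'GE 5 (PUSH(1)) (PUSH(0)))


lista = [[0, 0, 0, 2], [0, 0, 0, 7], [0, 0, 0, 9]]
print(slist(lista, fourth_word_ge_5))       # address of the first match
```

The search spec. leaves exactly one value on the stack, which is why the text advises against further stack manipulation inside it.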

Execution Monitoring and Analysis Instructions

In this section, the subset of the DAME instruction repertoire
which performs the functions essential to providing the wide
range of facilities desirable in a general purpose execution
analysis facility is described. The style of the exposition is
again narrative and informal to give a good intuitive understanding
of the primitive operations and data structures involved. Starting
with the instructions for inserting hooks, defining nodes and
creating input/output sets, I shall describe instructions for
searching input/output sets, restoring node instances, "instant
replays", monitoring specific paths of control flow, automatically
collecting the last k values of a location and addressing them
via an operation similar to indexing, as well as the instructions
for typing out node objects and node instances which have been
mentioned before.

The Hook Mechanism, described earlier, is used to insert
hooks to perform the user-specified actions at user-specified
times. Instructions for manipulating hooks are:

HOOK(<hook-type> <action> [<address range>] <hook name>)

(e.g. HOOK('IC (TOBJ(A)) 'HIC) Type the contents of A
after every instruction)

DEL(<hook name>)

DISAB(<hook name>)

ENAB(<hook name>)

(e.g. DEL(HIC), DISAB(HIC), ENAB(HIC))

(Note: Brackets [, ] indicate optional operands.)

These will insert, delete, disable or enable, respectively,
a hook named <hook name>. The <address range> is only required
for addressed hooks.




Creation of Nodes and Input/Output Sets

The Node Mechanism, also described earlier, can be evoked
in one of two ways: via the NODE instruction or via the NTR
instruction. The syntax for the former is:

NODE(<address range> <node name>)

(e.g. NODE(20000 20100 'NODEA))

The execution of this instruction will cause a node-object
of name <node name> to be created. The format of the user-words
of a node-object is given in the following figure. Each node-object
contains, among other data, a pointer to each of two lists:
its input-set list (ISL) and its output-set list (OSL). If the
node has not been executed as yet, these lists are empty. An input
(or output) set consists of a list of ten-word objects. An
(address, value) pair is inserted into each word, starting with
the first word of the first object. When and if the first object
is full, a second object is created, and so on. The list head
contains one user word which contains the index of the first
empty word in the last object. All unused words contain zeros.

The high order bit of each word contains a 1 if only a byte was
accessed, 0 otherwise. (The only variation to this rule is in
the case of the Processor Status word PS. Since the PS is
essentially bit-addressable, an indication of which bits were read
or written is needed. To do this, we can take advantage of the fact
that only the lower 8 bits of the PS are usable by the -11: in an
input/output set, a bit mask in the upper byte can indicate which
bits were accessed. However, this feature is not implemented and
the PS is treated like any other address.) The PS and all the
general and device registers are represented by their console
addresses. The format of ISL's and OSL's is given in Figure 3.3.
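The layout of an input or output set described above can be modeled with a short Python sketch. It is a hedged illustration only: the text gives the ten-word objects, the list-head index, the zero fill and the high-order byte flag, but not the position of the address and value within a word, so the packing used by `pack` is an assumption, as are the names `IOSet` and `first_empty`.

```python
# Illustrative model of an input (or output) set; not DAME code.
# Assumption: the packing of flag, address and value within a 36-bit word
# is invented here, since the text does not give the bit layout.

WORDS_PER_OBJECT = 10
BYTE_FLAG = 1 << 35          # high-order bit of a 36-bit word


def pack(address, value, byte_only=False):
    word = ((address & 0xFFFF) << 16) | (value & 0xFFFF)
    return word | BYTE_FLAG if byte_only else word


class IOSet:
    def __init__(self):
        self.objects = [[0] * WORDS_PER_OBJECT]  # unused words contain zeros
        self.first_empty = 0                     # kept in the list head

    def add(self, address, value, byte_only=False):
        if self.first_empty == WORDS_PER_OBJECT:  # last object is full
            self.objects.append([0] * WORDS_PER_OBJECT)
            self.first_empty = 0
        self.objects[-1][self.first_empty] = pack(address, value, byte_only)
        self.first_empty += 1


s = IOSet()
for i in range(12):
    s.add(0o1000 + 2 * i, i)
print(len(s.objects), s.first_empty)
```

Twelve pairs spill into a second ten-word object, and the list-head index then points at the third word of that object.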


To provide for more flexibility in the use of I/O sets,
separate instructions to initialize and build I/O sets have been
provided. Since the building of these sets adds quite a bit of
overhead to execution, I have found it useful to prepare a
standard DAME routine in a text file which I can evoke only when
I wish to construct I/O sets. The instructions provided for
this purpose are IIS(), BIS() and CIS() to initialize, build and
close input sets; IOS(), BOS() and COS() to perform the same
functions for output sets. (The parenthesis pairs indicate empty
parameter lists and are required by the syntax of the language.)

For example, the following hook causes the initialization
of a new input-set at each node entry.

HOOK('NE (IIS()) 'HNE)


Illustration 3.3: INPUT-SET LIST (ISL) for a Node
(Figure labels: ISL ptr from node-obj.; IS ptr from Node-trace table; address; value)


The hook name HNE names the hook object and can be used
to disable or enable the hook later, e.g. by the instruction

DISAB(HNE) or

ENAB(HNE).

The definition of nodes can be defaulted by using the Node
Trace, NTR(), instruction. In this case, nodes are defined by
DAME by monitoring the control flow. When it varies from sequential
flow, e.g. when a branch other than an unsuccessful conditional
branch is executed, the current node instance is terminated and
a new node instance is entered. If the new instance is the first
one for its node, a new node object is created. Thus, in this
mode, each node ends with a branch or subroutine call instruction.
I/O sets are handled in the same manner as when the NODE
instruction is used.


Searching of Executed Nodes and Input/Output Sets

With the node objects, the I/O sets and the node-trace table
described earlier, the user can extract whatever information
one needs by using the language facilities and the list
processing operations. However, to facilitate certain types of
analyses, a set of instructions intended specifically for
searching these data structures is provided. These are: Find
Input Set (FISET), Find Output Set (FOSET), Find Node Object
(FNO), Find Node Instance (FNI), Find Value (FVAL), Find Value
Indirect (FIVAL), Playback Values, Restore Node Instance (REST),
Replay Node Instance (RPLAY), Type Node Instances (TNI) and
Type Node Objects (TNO). I shall now describe each of these
instructions in detail.

The Find Input Set (FISET) instruction attempts to find an
input set satisfying certain conditions. The specifiable
conditions are the identification of the node whose input
sets are to be searched, the starting point and the time direction
of the search (i.e. forward or backward in execution history)
and a predicate which should be applied to each input set. The
means for specifying the predicate was of some concern, since
the predicate could be arbitrarily complex and I did not wish
to define a new language for this purpose. The technique
adopted is to write the predicate in the DAME language itself,
using its instructions to obtain and access the values contained
in each input set. The syntax of the instruction is:

FISET(<object id> <node spec.> <search spec.>
      [<direction> [<starting index>]])




Let us ignore all the operands except <search spec.> for
the time being. <search spec.> must be a DAME routine name
or an explicit instruction sequence (similar to a "compound
statement" or "compound expression" in some programming languages).
Before the <search spec.> is entered, the system locates and
internally PUSHes the address of the next input set to be searched
(PUSH and POP were described in the preceding section). The
user must obtain this address by a POP(A) instruction, where
A is some object name, which puts the address of the input set
to be searched in object A. Then the contents of address K in
that input set can be extracted and saved in some object B, by
the Find Value instruction, as FIVAL(B K A). The user can, in
this manner, obtain the contents of any address in the input set
pointed to by A, and perform calculations on them using the language
facilities. If he is finished with the search (e.g., he has
found the input set he is looking for), he PUSHes a 1; otherwise
he PUSHes a 0. After the last instruction in <search spec.> has
been executed, the system will POP the stack. If the value is
0, then if the end of the node trace has been reached, it will
insert a -1 in <object id> and will terminate the FISET instruction.
If the value is 0 and the end of the node trace has not been
reached, it will push the address of the next input set to be
searched, proceeding in the direction specified by <direction>
and re-apply <search spec.>. If the popped value is a 1, the
index of the node instance just searched will be inserted in
<object id> and the instruction will be terminated. Thus, after
the FISET instruction, <object id> will contain either -1, which
indicates that no input set satisfying the specifications was
found, or it will contain the address of the first acceptable set.

To  illustrate  the  use  of  this  instruction,  suppose  at  some 
point  in  the  execution  we  wish  to  find  the  most  recent  input  set 
where  the  contents  of  location  1000  equal  the  contents  of  location 
2000,  and  put  the  address  of  that  input  set  into  some  object  D. 

To  do  this  we  shall  need  three  more  objects  (in  fact,  we  could 
get  by  with  one  by  using  the  same  object  for  various  purposes 
but  we  shall  not  do  so  here).  The  following  instructions  create 
these  objects  and  perform  the  required  search: 

CR('A) CR('B) CR('C) CR('D)

FISET(D '* (POP(A)
            FIVAL(B 1000 A)
            FIVAL(C 2000 A)
            IF(B 'EQ C (PUSH(1)) (PUSH(0)))
           ))

The symbol '* for <node spec.> indicates that all nodes are
to be searched. The syntax of the IF instruction is:

IF(<obj. id> <relation> <obj. id> <then-case> [<else-case>])
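The search performed by the FISET example can be sketched in Python. This is a simplified model under stated assumptions, not DAME's implementation: the node trace is a Python list holding one input set per node instance, each input set is a dict from address to value, the stack protocol is collapsed into a Boolean predicate, and the default of searching backward from the most recent instance is assumed for the example.

```python
# Illustrative model of the FISET example; not DAME code.
# Assumptions: the node trace is a Python list with one input set per node
# instance, and each input set is a dict mapping address -> value.


def fiset(node_trace, predicate, start=None, backward=True):
    """Return the index of the first input set accepted, or -1 if none."""
    if not node_trace:
        return -1
    if start is None:
        start = len(node_trace) - 1 if backward else 0
    indices = range(start, -1, -1) if backward else range(start, len(node_trace))
    for index in indices:
        if predicate(node_trace[index]):
            return index
    return -1


def same_1000_2000(input_set):
    # Mirrors FIVAL(B 1000 A), FIVAL(C 2000 A), IF(B 'EQ C ...).
    return (1000 in input_set and 2000 in input_set
            and input_set[1000] == input_set[2000])


trace = [{1000: 4, 2000: 4}, {1000: 1, 2000: 2}, {1000: 7, 2000: 7}, {1000: 3}]
print(fiset(trace, same_1000_2000))
```

Searching backward finds the most recent matching instance; passing `backward=False` would find the earliest one instead.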




The Find Output Set (FOSET) instruction works exactly in
the same way as FISET, except that output sets are searched.

The Find Node Object instruction, whose syntax is FNO(<obj. id>
<-11 address>), inserts in <obj. id> the address of the node
object associated with <-11 address> if such an object exists.
Otherwise a -1 is inserted.

Find Node Instance, FNI(<obj. id> <node id> <n> [<starting
index> [<direction>]]), will similarly insert in <obj. id> the
index of the nth instance of <node id>, searching the node trace
in the <direction> specified, starting from <starting index>.
The default values for the two optional operands are: "the
current node instance" and "backward", respectively.


Find Value and Find Value Indirect are used to extract the
value associated with an address in an input or output set where
the I/O set address is given in the instruction, and where the
I/O set is pointed to by the object given in the instruction,
respectively.

The Restore to Node Instance instruction, REST(N), moves
backward in execution time, restoring the input sets of node
instances until index N is reached; e.g., if the current node
instance is the Kth node instance executed, REST(N) would restore
the last (K-N+1) input sets.
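The count (K-N+1) can be checked with a small Python sketch of the restore loop. It is a model under stated assumptions rather than the actual mechanism: node instances are indexed from 1, memory is a dict, and each input set is a dict of the values its instance read; the function name `rest` simply echoes the instruction.

```python
# Illustrative model of REST(N); not DAME code.
# Assumptions: node instances are indexed from 1, memory is a dict, and each
# input set is a dict of the (address, value) pairs read by that instance.


def rest(memory, input_sets, k, n):
    """Restore input sets of instances k, k-1, ..., n; return how many."""
    restored = 0
    for index in range(k, n - 1, -1):           # move backward in time
        memory.update(input_sets[index])        # put back the values read
        restored += 1
    return restored


input_sets = {1: {100: 5}, 2: {100: 6, 102: 1}, 3: {102: 2}, 4: {100: 9}}
memory = {100: 10, 102: 3}
print(rest(memory, input_sets, k=4, n=2), memory)
```

With K = 4 and N = 2, three input sets are restored, and memory ends up holding the values instance 2 saw on entry.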

The Replay Node Instances instruction, RPLAY(<starting index>
[<ending index>]), will cause the restoration of the input sets
of the node instances between the specified indices. The simulation
time is also restored. The instances whose input sets have been
restored are then re-executed. Upon termination of the last
instance the environment in which the RPLAY instruction was issued
is re-established.

The Type Node Instances instruction, TNI([<starting index>]
<count>), types the node trace entries for abs(<count>) instances,
starting at <starting index>, and moving forward in time if
<count> is positive or backward if <count> is negative, where
abs(x) denotes the absolute value of x.

The Type Node Objects instruction, TNO(<address1> <address2> ...),
types out the node objects associated with the specified addresses.

Detecting  Specific  Paths  of  Execution 

I  would  now  like  to  describe  the  instruction  ALONG,  whose 
syntax  is: 




ALONG(<path> <action>)

<path> -> <node id> / <path> <node id>

<action> -> <DAME routine name> / (<instruction sequence>)

Suppose we have defined nodes N1, N2, ..., N7. Then, the
instruction

ALONG(N1 N5 N7 X)

would cause the action X to be taken if the current node is N1,
or if the last two nodes have been N1 and N5, or if the last
three nodes have been N1, N5 and N7, in that order. In short,
the specified <action> is taken whenever the flow of control
could be following the specified path. The ALONG instruction
is, as are all DAME instructions, executable through every type
of hook. Hence it provides a convenient facility for taking
selective action (e.g., tracing) as a function of the locus of
control flow.
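The condition under which ALONG fires can be written as a prefix test. The following Python sketch is an illustration only; it assumes that the recent node history is available as a list of node names, most recent last, and the name `along` merely echoes the instruction.

```python
# Illustrative model of the ALONG test; not DAME code.
# Assumption: `history` is a list of the node names executed so far,
# most recent last.


def along(history, path):
    """True if the most recent nodes form a non-empty prefix of path."""
    for length in range(1, min(len(path), len(history)) + 1):
        if history[-length:] == path[:length]:
            return True
    return False


path = ["N1", "N5", "N7"]
print(along(["N3", "N1"], path))              # current node is N1
print(along(["N1", "N5"], path))              # last two nodes were N1, N5
print(along(["N1", "N5", "N7"], path))        # full path followed
print(along(["N1", "N2", "N7"], path))        # path left at N2
```

The first three calls correspond to the three cases listed in the text; the last shows a history that has left the path.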

Collecting and Accessing Previous Values of a Location

Finally, a mechanism for automatic collection of the previous
values of a location and for accessing those values is worth
mentioning. The first action is accomplished through the use
of two instructions. The first is the Initialize Value Trace,
IVT(<-11 address> <n> <obj. name>), which creates an object named
<obj. name> of a special subclass and large enough to hold <n>
previous values of location <-11 address>. The second instruction
is the Value Trace Hook, VTH(<-11 address>), which
causes the monitoring of values stored into the location <-11
address> and maintains the last <n> such values in a circular
buffer in the object <obj. name> created by the IVT instruction. Then,
at any point in the execution, the Kth previous value of <-11
address> is obtainable by a binary operator #, as

BA('# B <-11 address> K).

The instruction BA(<opr> <target> <opd1> <opd2>) is the
generalized "Binary Operation with Assignment" instruction and
performs the operation: <target> <- <opd1> <opr> <opd2> in infix
notation. Thus the above instruction would insert in B
the Kth previous value of <-11 address>. If K is larger than
the number of values declared to be kept in the IVT instruction,
an error message will be typed and no assignment will be made.
If K values have not yet been assigned to <-11 address>, then
a special code larger than 2^16 will be stored in B.
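A circular buffer with these three outcomes (value, error, special code) can be sketched in Python. The sketch rests on assumptions that the text does not settle: K = 1 is taken to mean the most recently stored value, the special code is modeled by a constant `NOT_YET`, and the error is modeled by raising `ValueError`. The class name `ValueTrace` is invented.

```python
# Illustrative model of IVT/VTH and the '# operator; not DAME code.
# Assumptions: K = 1 denotes the most recently stored value, and the
# "special code" is modeled as the constant NOT_YET (any value above 2**16).

from collections import deque

NOT_YET = 2**16 + 1


class ValueTrace:
    def __init__(self, n):
        self.n = n
        self.values = deque(maxlen=n)   # circular buffer of the last n values

    def on_store(self, value):          # what the value-trace hook would do
        self.values.append(value)

    def previous(self, k):
        if k > self.n:
            raise ValueError("K exceeds the number of values kept")
        if k > len(self.values):
            return NOT_YET
        return self.values[-k]


vt = ValueTrace(3)
for v in (10, 20, 30, 40):
    vt.on_store(v)
print(vt.previous(1), vt.previous(3))
```

Because a 16-bit location can never hold a value above 2^16, the special code cannot be mistaken for a traced value.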




3.7 Various Design Issues and Unimplemented Ideas

In this section, I shall discuss some design issues which
arose in the course of the development of DAME. Most of them
are related to improving the execution speed of the simulation
and decreasing the monitoring overhead. I shall also outline
some ideas which have not been implemented mainly because they
would not contribute significantly to the research aspects of
this project.

3.7.1  Representation  of  -11  Core  and  the  Design  of  the 
Hook  Mechanism 

Since  the  representation  of  the  PDP-11  core  and  the  Hook 
Mechanism  lie  at  the  heart  of  the  implementation  plan,  these 
two  points  are  worth  re-pondering  and  alternative  implementations 
worth  considering. 

As was mentioned, an earlier implementation of the simulator
packed two -11 words into a -10 word, one into the low-order
16 bits of each of the lower and upper halves of each 36-bit
-10 word. In that implementation, the high-order two bits of
each -10 halfword were used to indicate the presence or absence
of monitoring actions associated with the fetch or store of each
data word (e.g. a word fetched or stored by an instruction) or
with the fetch or completion of an instruction. The monitor
actions themselves were located via a table look-up on the
particular address involved. A separate table was used for each of
data fetch, store, instruction fetch and instruction completion
operations. This design makes possible a substantial saving in
the core requirement, approximately ((28K/2)-n), where n is the
number of locations for which a hook exists. The essential price
paid for this storage saving is the overhead of the table look-up
procedure. Assuming that approximately 1% of the locations are
hooked and a binary search is used, about 8 comparisons are
needed to locate the monitor action pointer associated with a
particular address. Further assuming that one address involved
in every instruction has some monitor action associated with it,
this overhead is roughly equivalent to twice the overhead of
decoding the op-code of an -11 instruction. In addition to the
monitoring actions associated with particular addresses, there
are those due to the so-called "general hooks", i.e. actions to
be taken at every fetch, or every store etc. Thus, there already
is substantial overhead due to monitoring. So, the decision to
map one -11 word into each -10 word and use the left half of the
-10 word for a pointer to associated monitor actions was intended
to avoid further degradation in the monitoring overhead, but
exactly how much is gained in response time in a time-sharing
environment is not clear since the larger core requirement delays
the swapping-in by the operating system scheduler. When the
word lengths of the object machine and the host machine are
equal and one would like to use a one-to-one mapping, a scheme
proposed by Bernard Lang [LA 72], called "Lambda-monitoring",
can be used. In this scheme, since there are no additional
bits available to indicate the presence of associated monitor
actions, one inserts a special bit pattern (called "Lambda")
into the word when one wants to associate monitor actions with
it. Then, at every fetch or store, one compares the contents
of the address being accessed with the bit pattern Lambda. If
they are found equal, this is taken as a signal that there may
be some monitor action associated with that address. Then, tables
set up for this purpose are searched, just as in the earlier
schemes. If an entry for that address is found, the monitor
action indicated by the entry is performed. The table entry also
includes the actual contents of that location. If no entry for
that address is found, no action is taken and the execution is
permitted to continue.
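The figures quoted for the packed representation (about 8 comparisons, a saving of roughly (28K/2)-n) can be reproduced with a few lines of Python. The only assumption made here is that "28K" means 28 * 1024 words of simulated -11 core.

```python
# Worked arithmetic for the estimates in the text. The only assumption is
# that "28K" means 28 * 1024 words of simulated -11 core.

import math

core_words = 28 * 1024
hooked = round(0.01 * core_words)           # about 1% of the locations
comparisons = math.log2(hooked)             # binary search over the table
saving = core_words // 2 - hooked           # (28K/2) - n words of -10 core

print(hooked, round(comparisons, 1), saving)
```

With about 287 hooked locations, a binary search needs a little over 8 comparisons, which agrees with the estimate given for the table look-up.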

This scheme is clearly very similar to the scheme used by
current debugging systems which insert a trap instruction into
any instruction address where the user wants to put a breakpoint.
The "Lambda-monitoring" scheme simply extends this technique to
apply to data elements as well as instructions.

In  using  such  a  scheme,  clearly,  "bugged"  locations  must  be 
write-protected  from  the  user;  i.e.  the  data  to  be  stored  into 
such  a  location  must  in  fact  be  trapped  and  re-routed  to  a  special 
register  holding  the  actual  contents  of  that  location.  That 
register  is  the  same  one  whose  contents  are  fetched  upon  a  fetch 
operation  on  the  bugged  location.  The  first  requirement  implies 
that  prior  to  every  store  operation,  the  current  contents  of  the 
store  address  must  be  fetched  and  compared  with  Lambda. 
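The fetch and store paths of such a scheme can be sketched in Python. This is a hedged illustration, not Lang's or DAME's code: the bit pattern chosen for `LAMBDA`, the table layout and all names are assumptions, and the "special register" holding the actual contents is modeled as a field of the table entry.

```python
# Illustrative model of "Lambda-monitoring"; not Lang's or DAME's code.
# Assumptions: LAMBDA's bit pattern, the table layout and all names here
# are invented for the example.

LAMBDA = 0o177777            # hypothetical reserved bit pattern


class Memory:
    def __init__(self, size):
        self.core = [0] * size
        self.table = {}      # address -> [actual contents, monitor action]

    def bug(self, address, action):
        self.table[address] = [self.core[address], action]
        self.core[address] = LAMBDA

    def fetch(self, address):
        word = self.core[address]
        if word == LAMBDA and address in self.table:
            actual, action = self.table[address]
            action("fetch", address, actual)
            return actual
        return word          # includes ordinary data that happens to equal LAMBDA

    def store(self, address, value):
        # The current contents must be checked before every store.
        if self.core[address] == LAMBDA and address in self.table:
            self.table[address][0] = value       # re-route to the table entry
            self.table[address][1]("store", address, value)
        else:
            self.core[address] = value


log = []
mem = Memory(16)
mem.store(3, 42)
mem.bug(3, lambda kind, addr, val: log.append((kind, addr, val)))
mem.store(3, 43)
print(mem.fetch(3), mem.core[3] == LAMBDA, log)
```

Note that an unbugged word which merely happens to equal Lambda costs a table search but triggers no action, as the text describes.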

I  shall  have  more  to  say  about  this  technique  in  Chapter  8, 
when  I  go  into  the  implementation  of  monitoring  features  in 
microprogram  or  hardware. 

3.7.2 Scheduling with Look-ahead

One of the main bottlenecks in the simulator is the event
scheduling process. As was mentioned, the time-grain of the
simulation is at the memory/register access level. The particular
simulation package which is used is a general-purpose simulation
package, in which an Event Notice is created for each event to
be scheduled showing the time of activation and the process to
be activated. After each event, the scheduler consults the event
calendar and activates the process indicated by the first event
notice having the earliest time of activation. In our case,
since there are no simulated devices other than the TTY, there
are usually only two processes which receive and surrender
control: the CPU and the Unibus. Further, the two are never
active simultaneously in simulated time. While this design is
a clean and consistent one, permitting the addition of new
devices to the Unibus in an easy way logically quite similar
to adding them to the real Unibus, and also permitting studies
on bus utilization, timing of signals between devices on the
bus etc. to be done very naturally, it is also quite expensive
in terms of scheduling overhead due to the event notice preparation,
placement and searching of the simulation calendar.

A technique which can be employed to reduce this overhead
is what may be termed a "look-ahead" technique, in which the
CPU checks the simulation calendar before it releases control
to the scheduler. If it finds no events scheduled (e.g. an I/O
device activity), rather than releasing control to the scheduler
which would activate the Unibus next, the CPU performs the core
access function itself, perhaps by calling a "routine" version
of the Unibus (as opposed to a coroutine version) which performs
an identical function as the coroutine version without the
coroutine jump statements.
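The look-ahead decision can be sketched in Python. The sketch is a deliberately small model under stated assumptions, not the simulator's code: the calendar is a heap of (time, sequence, name) event notices, the Unibus is reduced to a function that reads a word of core, and the names `cpu_read`, `unibus_routine` and `stats` are invented.

```python
# Illustrative model of scheduling with look-ahead; not the simulator's code.
# Assumptions: the calendar is a heap of (time, sequence, name) event notices
# and the Unibus is reduced to a function reading a word of core.

import heapq

core = {0o1000: 0o123}
calendar = []                 # pending event notices
stats = {"direct": 0, "scheduled": 0}


def unibus_routine(address):
    """'Routine' version of the Unibus: same function, no coroutine jump."""
    return core[address]


def cpu_read(address, now):
    if not calendar:                       # look-ahead: nothing else pending
        stats["direct"] += 1
        return unibus_routine(address)
    # Otherwise fall back to the general mechanism: file an event notice
    # and let the scheduler activate whichever process is due first.
    stats["scheduled"] += 1
    heapq.heappush(calendar, (now, stats["scheduled"], "unibus"))
    while calendar:
        _, _, name = heapq.heappop(calendar)
        if name == "unibus":
            return unibus_routine(address)
        # another device's event would be simulated here


print(cpu_read(0o1000, now=0), stats)
heapq.heappush(calendar, (5, 0, "tty"))
print(cpu_read(0o1000, now=7), stats)
```

The first read takes the direct path because the calendar is empty; the second goes through the calendar because a TTY event is pending.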

Some measurements on the gain in simulation speed through
this technique are reported in Chapter 5.

3.7.3 "Blow-up" Representation of the Processor Status Word

Another technique by which the speed of the simulation may
be increased is reducing the amount of individual bit manipulation
in the handling of each PDP-11 instruction since this is a very
slow operation in the PDP-10 (at least, in our model). A good
candidate for this case is the modification of the Processor
Status word (PS), since most instructions modify one or more bits
in this word. Further, each bit must be computed and set separately.
Since the PS is affected by most instructions, this causes a good
bit of overhead.

This problem can be alleviated to a certain extent by
representing each of the six fields of the PS by a separate word.
However, care must be taken that, in case the user program
explicitly addresses the PS, the result of the read or write
operation is reflected properly on the Unibus lines and the words
representing individual PS fields.
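The idea of keeping each PS field in its own word, and packing or unpacking only on an explicit access, can be sketched in Python. The field layout used (priority in bits 7-5, then T, N, Z, V and C) is the standard PDP-11 one; the class and method names are assumptions made for the illustration.

```python
# Illustrative model of the "blow-up" PS representation; not the simulator's
# code. The field layout (priority in bits 7-5, then T, N, Z, V, C) is the
# standard PDP-11 one; the class and method names are assumptions.


class BlownUpPS:
    def __init__(self):
        # One separate variable per field, so that instructions can set
        # condition codes without bit manipulation.
        self.priority = self.t = self.n = self.z = self.v = self.c = 0

    def read_word(self):
        """Pack the fields when the program explicitly reads the PS."""
        return (self.priority << 5 | self.t << 4 | self.n << 3
                | self.z << 2 | self.v << 1 | self.c)

    def write_word(self, word):
        """Unpack when the program explicitly writes the PS."""
        self.priority = (word >> 5) & 0o7
        self.t = (word >> 4) & 1
        self.n = (word >> 3) & 1
        self.z = (word >> 2) & 1
        self.v = (word >> 1) & 1
        self.c = word & 1


ps = BlownUpPS()
ps.z, ps.c = 1, 1            # e.g. an instruction producing zero with carry
print(oct(ps.read_word()))
ps.write_word(0o340)         # priority 7, all other fields clear
print(ps.priority, ps.z, ps.c)
```

The cost of packing is paid only on the comparatively rare explicit PS accesses, while the frequent condition-code updates become plain assignments.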




3.7.4 "Compilation" of Decoded -11 Instructions

As it is designed, the simulator re-interprets every
PDP-11 instruction every time it is executed. In particular,
the extraction of the op-code, the operand addressing modes,
the operand registers and the selection of the particular simulator
routine to be called, causes considerable loss of efficiency in
each re-interpretation of an instruction. What this suggests
is a "compilation" of each executed PDP-11 instruction into
PDP-10 code tailored specifically for that particular -11
instruction, in which all variability has been eliminated. This would
provide for much more efficient execution of that -11 instruction
subsequently. (The concept of processors which can execute both
compiled and interpretive code has been implemented in various
systems, e.g. PDP-10 LISP. Also see J. Mitchell for a good
discussion of this topic [MI 70].)

There are several problems which must be resolved however.
One is the fact that there will be considerable overhead associated
with the compilation itself; therefore, instructions which will
be executed fewer than some number, n, times should not be compiled,
where n is a function of the actual overhead of compilation versus
field interpretation. However, in general, we do not know
beforehand the number of times each instruction will be executed. Hence,
it is difficult to tell which instruction to compile and which
not to compile. One heuristic rule which can be used is that
if an instruction is used a second time, it is probably a part
of a loop or a common subroutine etc. and hence its chances of
being used again are good. Therefore, a reasonable approach may
be to compile an instruction the second time it is used. There
will clearly be some waste due to the compilation of instructions
which are executed exactly twice or even those which are executed,
say, three or four times. This parameter, namely, the number n,
can be examined more thoroughly after the compilation process
has been implemented; it may well turn out that this number
varies as a function of the instruction class, e.g. a simple
unconditional branch may not be worth compiling at all, whereas
a double-operand instruction may be worth compiling after its
first execution.

It  is  clear  however  that  the  simulator  has  to  be  able  to 
execute  both  forms  of  PDP-11  instructions,  i.e.  the  "uncompiled" 
PDP-11  machine  instruction  and  the  "compiled"  version,  which  in 
the  ultimate,  is  a  sequence  of  PDP-10  machine  instructions 
associated  with  the  particular  -11  instruction  location. 

Another question which must be resolved in order to use
this technique is how to associate the -10 code with the appropriate
-11 instruction address. One solution may be to insert the -10
instructions associated with a particular -11 location into
an object and to use a table of size k, containing pointers to
these objects, where k is the size of -11 memory containing
instructions. The overhead required to locate the required -10
code must be minimized to make this technique worthwhile.

In the design of DAME, there is a particular feature, namely
the Association List for each core location, which solves this
problem very naturally. One can insert the object containing
the -10 instructions for a particular -11 location as the first
element in the Association List of that location. If more
generality is desired, one can introduce a new subclass, called "PDP-10
code subclass", and insert an object of that subclass anywhere
in the Association List. However, this will of course increase
the search time. The use of association lists for this purpose
also obviates the need for the large table required by the first
technique.

Finally, in this connection, we must note a problem with
self-modifying programs, namely that if a particular instruction
is modified during the course of execution, its old "compiled"
version must be deleted and a new decision has to be made as
to whether the new version should be compiled. In fact, if a
particular instruction will be changed frequently, it probably
should not be compiled.
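As an illustration of the association-list approach and of the invalidation needed for self-modifying programs, the following Python sketch keeps the compiled code as the first element of a per-location list and deletes it whenever the location is stored into. The names (AssociationLists, CODE_TAG, max_modifications) are hypothetical and do not correspond to DAME's internal structures.

```python
from collections import defaultdict

CODE_TAG = "pdp10-code"  # hypothetical tag marking a compiled-code object


class AssociationLists:
    """One list of (tag, payload) objects per -11 core location."""

    def __init__(self, max_modifications=2):
        self.lists = defaultdict(list)
        self.modifications = defaultdict(int)  # address -> number of stores seen
        self.max_modifications = max_modifications

    def attach_code(self, address, code):
        # The compiled code goes first so that lookup needs no search.
        self.lists[address].insert(0, (CODE_TAG, code))

    def compiled_code(self, address):
        items = self.lists.get(address)
        if items and items[0][0] == CODE_TAG:
            return items[0][1]
        return None

    def on_store(self, address):
        # The instruction at this address was modified: its old compiled
        # version is deleted; other associated objects are kept.
        self.modifications[address] += 1
        self.lists[address] = [it for it in self.lists[address] if it[0] != CODE_TAG]

    def worth_compiling(self, address):
        # Frequently modified instructions are probably not worth compiling.
        return self.modifications[address] <= self.max_modifications
```

Checking only the first element keeps the lookup cost constant; allowing the code object anywhere in the list would require a search, which is the trade-off noted above.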


3.7.5 Further Compilation of DAME Instructions


Another area worth considering for improved efficiency is
that of further compilation of DAME instructions into -10 code.
This is particularly true for heavily monitored programs. At
present, DAME instructions are only "assembled", i.e. the DAME
op-code and symbolic operands are replaced with the -10 addresses
of the routine to execute that instruction and the addresses of
the operands, respectively. Such things as determining the number
of operands and certain kinds of type and size checking are done
at run-time. It is possible to do a large part of this at routine
definition time since object size, class and subclasses are
declared when the object is created. This would not be possible,
however, for indirectly accessed objects.


3.7.6 A "Limited-Run Complete-Trace" Feature

As was described earlier, backtracking to a particular
instruction n is implemented by restoring, in reverse chronological
order, the input sets of node instances until the one including
the instruction n is restored, and then executing the instructions
preceding n in that node instance. Backtracking has been implemented
in a different way by at least one other worker, Ralph Grishman,
in the AIDS system at the NYU Courant Institute of Mathematical
Sciences. The following description of the implementation of
this mechanism is taken from R. Stockton Gaines' thesis:

"... The back-up mechanism mentioned above is original with
Grishman, and is sufficiently interesting to warrant a detailed
description of how it is accomplished. AIDS keeps four tables
for this purpose; let us call them R1, R2, S1 and S2. As AIDS
is interpreting the user's program it goes through the following
process. At some point it saves the state of the machine registers
in R1, and after that each time the user's program stores a new
quantity into a location in memory, the previous quantity at the
location is saved in S1 together with the address which is being
changed. When S1 is full, the registers are stored in R2, and
execution continues with AIDS saving the previous values and
addresses to which stores are made in S2. When S2 is full, the
process starts over again with R1 and S1, and so on. When the
user issues a request to back up, AIDS fetches the most recent
item from S1 or S2 and puts it back where it was originally. It
then puts back the next most recent, and so on, until it has put
back the first quantity saved after the next to last time the
registers were saved. At this point it can restore the registers
to the values they had the next to last time they were saved,
and AIDS can reexecute the program from that point to the interrupt
at which the back-up was requested, since the program and its
storage are now in the same condition they were when the program
reached that point for the first time..."
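The two-table scheme in the quoted description can be illustrated with a small Python sketch. It is a simplified model under stated assumptions rather than a description of the actual AIDS implementation: memory and registers are plain dictionaries, the table capacity is a parameter, and the class and method names (BackupLog, record_store, back_up) are hypothetical.

```python
class BackupLog:
    """Two register snapshots (R1, R2) and two store logs (S1, S2)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.regs = [None, None]   # R1, R2: register snapshots
        self.stores = [[], []]     # S1, S2: (address, previous value) pairs
        self.current = 0           # index of the table now being filled

    def start(self, registers):
        self.regs = [dict(registers), None]
        self.stores = [[], []]
        self.current = 0

    def record_store(self, memory, registers, address, new_value):
        """Log the previous contents of `address`, then perform the store.

        `registers` is the register state just before the storing instruction.
        """
        if len(self.stores[self.current]) == self.capacity:
            # Current table is full: snapshot registers, reuse the other table.
            self.current ^= 1
            self.regs[self.current] = dict(registers)
            self.stores[self.current] = []
        self.stores[self.current].append((address, memory[address]))
        memory[address] = new_value

    def back_up(self, memory):
        """Undo stores back to the next-to-last register save (if there is
        one) and return the registers saved at that point."""
        newer, older = self.current, self.current ^ 1
        tables = [newer]
        if self.regs[older] is not None:
            tables.append(older)
        for idx in tables:
            while self.stores[idx]:
                address, old_value = self.stores[idx].pop()
                memory[address] = old_value
        target = tables[-1]
        if target != newer:
            self.regs[newer] = None
        self.current = target
        return dict(self.regs[target])
```

Because one table is always kept intact while the other is being refilled, the user can back up by at least one table's worth of stores at any time, at the price of logging every store.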

While I believe that the node mechanism and the input/output
set concept of DAME have significant advantages over this method
in terms of storage requirement and the ease with which the
collected information may be used in data flow analysis, there
are times at which the user would like to see a complete trace
of certain portions of his program. At present, this can be
done in DAME by attaching general hooks to fetch, store and
instruction completion events to type out the required information.
Alternatively, if the number of instructions to be thus traced
is small, each instruction can be declared a node, in which case
the node mechanism will construct the input and output sets for
each instruction. Nevertheless, it may be desirable to have a
"detailed trace" mode in which every memory and register access
is recorded in a "trace object". This would be useful, for example,
in directly answering questions like "What was the second value
assigned to X in node N?", or "What was the value of X at
instruction I?", without the restoration of the required input sets etc.
However, such a facility would have to be used in a highly selective
and judicious manner since it would require a great deal of storage
and CPU time overhead.


CHAPTER 4


ILLUSTRATIVE EXAMPLES OF SOME APPLICATIONS OF DAME


In this chapter, I shall illustrate the main features of
DAME through examples of its application. A modest
familiarity with the architecture of the PDP-11 will be helpful
[...]. Assembly language notation will
be explained as necessary.

The first example illustrates the operations of loading a program,
inserting hooks, initializing -11 core and initiating execution. The
second example demonstrates the construction of a node transition
matrix M whose element M(i,j) is the number of transitions from
node i to node j since the beginning of the execution. In the
third example, a lower bound for the time (in terms of [...])
required to execute a recursive program, given [...] number of
identical processors, is calculated; [...] of the execution tree
is determined [...]. The fourth example demonstrates the
construction of X(N,M), each element of which is a triple
(a1, a2, a3), where a1 is an address which is in the intersection
of the input set of the ith instance of node N and the output set
of the preceding instance of node M. If no instance of node M has
executed between the (i-1)th and ith instances of N, then X is
empty. a2 is the value read from location a1 by N and a3 is the
value of a1 at the exit from the preceding instance of M.

These four examples are intended to provide illustrations
of analyses of control flow, data flow and performance,
which are the [...] analyses for which a system like DAME
is suitable. I shall use the same PDP-11 program to demonstrate
the various analysis techniques. For the purpose of simplicity
in exposition, the chosen program is a small one: a one-page
quicksort routine. Its code is given and explained in Example 1.

The fifth example is not one for which DAME is particularly
well suited; it is included here to show that, even in cases which
would strain a simulator-based software monitor system, one
can make useful analyses by exercising some intelligence in
its use. This example deals with collecting instruction mix
and addressing mode usage statistics on several PDP-11 programs.
The collected statistics, while they are interesting and possibly
useful in their own right, were used to project the running
times of the same programs on a PDP-11/40 and -11/45.

Example 1. Nodes and Input/Output Sets of a Quicksort Program

As a first example, let us consider the PDP-11 assembly
language program QUICKSORT, whose text is given in the next
illustration. (For a specification of the DEC assembly language
see [DEC 71].) The program implements a simplified version of
the "quicksort" algorithm as given by Knuth in [KN 73]. The
code given here was compiled by the BLISS-11 compiler [DEC 73]
to be assembled by the MACX11 assembler. To explain briefly some
of the notation in the assembly language: # denotes immediate
operands, R$i means register i, SP is the stack pointer (register 6),
@ denotes indirect addressing, -(R) denotes automatic decrementation
of register R before its contents are used, and (R)+ denotes
automatic incrementation of register R after its contents have
been used. All integers are in octal. The syntax of double-operand
instructions is:

<opcode> <source-operand>, <destination-operand>

The program consists of two parts: a recursive subroutine
called QSORT located between (relative addresses) 0 and 166, and
the main program between 170 and 204. The main program expects
two integers in registers 0 and 1, which are to be the bounds of
the core locations whose contents are to be sorted. It simply
pushes these parameters on the stack (which grows downward from
its initial value of 1400) and calls the subroutine QSORT. This
subroutine works as follows:

It uses R1 and R2 to point to the lower and upper bounds,
respectively, of the vector to be sorted. If R1 is greater than
or equal to R2, there is no sorting to be done; hence it returns.
Otherwise, it compares the elements pointed to by R1 and R2. If no
exchange is necessary, R2 is decremented by 1 and the process is
repeated. After the first exchange R1 is incremented by 1 (Note:
Since sorting is done in units of words, the addresses are really
incremented by 2). Comparison with the element pointed to by R2
and incrementation continues until another exchange occurs, at
which point R2 is decreased again. The sorting goes on this way,
"burning the candle at both ends", until R1 and R2 point to the
same element. During this process, the value which was initially
pointed to by R1 has been exchanged every time the direction was
switched. When R1=R2, this value will have found its final position,
i.e., the position it must have in the completely sorted vector.
(The interested reader can convince himself of this.) Further,
this element now divides the vector into two parts, namely,
that to its left and that to its right. These two parts, which
Knuth calls "subfiles", can be sorted with the same procedure.
Hence, QSORT then calls itself twice, to sort first the left
subfile and then the right subfile.
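The procedure just described can be restated as a short Python sketch. It is not a transcription of the BLISS-11 output: list indices stand in for the addresses held in R1 and R2, and the Boolean moving_upper plays the role of the direction flag; these names are introduced here for illustration only.

```python
def qsort(a, lo, hi):
    """Sort a[lo..hi] in place, "burning the candle at both ends"."""
    if lo >= hi:
        return                      # nothing to sort
    i, j = lo, hi                   # play the roles of R1 and R2
    moving_upper = True             # flag: True means the upper pointer moves
    while i < j:
        if a[i] > a[j]:
            a[i], a[j] = a[j], a[i]          # exchange
            moving_upper = not moving_upper  # switch direction
        if moving_upper:
            j -= 1
        else:
            i += 1
    # i == j: the value initially at a[lo] is now in its final position.
    qsort(a, lo, i - 1)             # left subfile
    qsort(a, i + 1, hi)             # right subfile


data = [5, 3, 8, 1, 9, 2, 7, 3]
qsort(data, 0, len(data) - 1)
```

At every step one of the two pointers designates the value that was initially first in the vector; each exchange moves that value to the other end of the unexamined region, which is why it ends up in its final position when the pointers meet.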


The -11 code given in Illustration 4.1 was produced by the
BLISS-11 compiler and the comments, preceded by the symbol "!",
were inserted later by hand.


In this example, I shall load the -11 program, which is
stored in a file named QSORT, initialize a vector of 40 elements
to be sorted, set the default mode for node definition and, at
every node-exit, type out the node-object and the current input/
output set for the first five node instances. Then I will let
the program run to completion. The monitor instructions which
will be used to do this are contained in a file called DEMO1,
which is listed in Illustration 4.2. In this file, there are
two routines: DEMO1 and TIO (for Type I/O Sets). DEMO1 loads
two copies of the Quicksort program, one starting at location
20000 (this copy is the one which is executed) and another
starting at location 30000 (this copy will be used as data: a
small part of it will be sorted); it initializes registers R0
and R1; loads another routine DEFIO from a file called DEFIO and
executes it; defaults the node definition; creates an object
C to be used for counting to 5; initializes it to 0; inserts
a hook to call the routine TIO at every node exit; and gives control
to the PDP-11 starting at location 20170, the address of the main
program. (All "relative addresses" in this example are relative
to 20000.)


The routine DEFIO (listed at the bottom of Illustration 4.2)
is a standard routine for constructing input/output sets; its
operation is described further below.


The symbols CURNOBJ, CISP and COSP used in TIO are global
PDP-10 variables which point to the current node-object, the
current input-set and the current output-set, respectively. (As
a practical matter in the use of DAME, if the monitor routines
to be used turn out to be long or if we aren't sure they are
correct, it is a good idea to prepare them as text files and
load them at run-time rather than define them on-line, during
the analysis session.) The format of type-out for objects and
lists is: The words of an object are typed between slashes. If
a word is not zero, it is typed as <left half>,,<right half>,
otherwise it is typed as 0. Thus, node-objects are typed out as:


<starting loc.>,,<ending loc.>/<no. of instr. in it>,,
<no. of instances>/<input-set ptr>,,<output-set ptr>

Lists are typed as [<object> <object> ...]. Certain objects,
called rep-objects, which are an artifact used for implementing
hierarchical list structures and membership in multiple lists,
contain only pointers and are typed as -> followed by the pointed
object. I/O sets are simply lists of ten-word objects each
containing up to ten <address>,,<value> pairs. Unused words
contain zeros. Thus, an I/O set containing up to 10 (address,
value) pairs is typed out as:

[<address>,,<value>/<address>,,<value>/.../0/0]

An I/O set containing more than one 10-word chunk is typed
out as:

[<first 10 words> -> <next 10 words> -> ...]

(Registers 0 through 7 are represented by their console
addresses 177700 through 177707 respectively, and the processor
status word by 177776.)

Illustration  4.3  shows  the  protocol  for  this  example.  User- 
typed  portions  are  underlined.  The  comments  in  small  type  were 
entered  later  and  are  not  a  part  of  the  protocol. 

DEFIO works as follows. It inserts the hooks named HIOS
(Initialize Output Set) and HIIS (Initialize Input Set) to be
activated at node entry (i.e. hooks of type NE). These operations
could have been done with one hook but this method permits either
one to be disabled and enabled without affecting the other. These
routines issue the IOS() and IIS() instructions respectively. To
build the I/O sets during the node instance, the hooks named HBOS
(Build Output Set) and HBIS (Build Input Set) are inserted to be
activated at every operand store and every operand fetch,
respectively. These routines issue the BOS() and BIS() instructions
to maintain their respective sets. At node exit, the input and
output sets must be closed. This is done by the hooks named HCIS
and HCOS, by issuing the CIS() and COS() instructions respectively.
One final problem is that in case the entire -11 program is not
covered by nodes, the hooks HBIS and HBOS must be turned off at
exit from a node and re-enabled at entry into a new one. Thus,
DEFIO initially disables these hooks, and inserts the hooks named
HENB and HDISB which are activated at node entry and exit
respectively. They call the routines ENAB and DISAB to perform
their functions.
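The hook arrangement set up by DEFIO can be mimicked in a small Python sketch. Everything here is a stand-in: the Monitor class, its fire method and the dictionary representation of the sets are hypothetical, and the rule used for the input set (an address is recorded the first time it is fetched, unless the node has already stored into it) is an assumption made for the illustration, not a statement of how BIS() behaves.

```python
class Monitor:
    """Toy event dispatcher with named hooks that can be enabled or disabled."""

    def __init__(self):
        # Event types: NE = node entry, NX = node exit,
        # OF = operand fetch, OS = operand store.
        self.hooks = {"NE": [], "NX": [], "OF": [], "OS": []}
        self.enabled = {}
        self.input_set = {}
        self.output_set = {}
        self.closed_inputs = []
        self.closed_outputs = []

    def hook(self, event, routine, name):
        self.hooks[event].append((name, routine))
        self.enabled[name] = True

    def enab(self, name):
        self.enabled[name] = True

    def disab(self, name):
        self.enabled[name] = False

    def fire(self, event, *args):
        for name, routine in list(self.hooks[event]):
            if self.enabled[name]:
                routine(*args)


def defio(m):
    """Install hooks analogous to HIOS, HBOS, HCOS, HIIS, HBIS, HCIS, HENB, HDISB."""

    def ios():
        m.output_set = {}

    def iis():
        m.input_set = {}

    def bos(address, value):
        m.output_set[address] = value

    def bis(address, value):
        # Assumption: only the first fetch of a not-yet-stored address counts.
        if address not in m.output_set and address not in m.input_set:
            m.input_set[address] = value

    def cos():
        m.closed_outputs.append(dict(m.output_set))

    def cis():
        m.closed_inputs.append(dict(m.input_set))

    def enable_builders():
        m.enab("HBOS")
        m.enab("HBIS")

    def disable_builders():
        m.disab("HBOS")
        m.disab("HBIS")

    m.hook("NE", ios, "HIOS")
    m.hook("OS", bos, "HBOS")
    m.hook("NX", cos, "HCOS")
    m.hook("NE", iis, "HIIS")
    m.hook("OF", bis, "HBIS")
    m.hook("NX", cis, "HCIS")
    disable_builders()              # initially disable the "build" hooks
    m.hook("NE", enable_builders, "HENB")
    m.hook("NX", disable_builders, "HDISB")
```

Keeping the builders disabled outside nodes means that fetches and stores made by code not covered by any node cost only a check of the enabled flag.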

While this procedure for building I/O sets is rather long
and elaborate, it is more efficient and flexible than automatically
maintaining the I/O sets. Further, by preparing it as a file
and simply calling it when required, it can be used easily.


The routine TIO tests C. If it is less than 5, the following
is done: a carriage return-line feed (CRLF) and the character
string 'NODE' are typed (in DAME, a character string cannot
normally exceed 5 characters and it is preceded by a single quote);
the current node object (pointed to by the global CURNOBJ) is typed
through the indirect instruction TIOBJ, which also types a CRLF;
then the text 'INPUT-SET' (typed in two pieces) and the current
input set are typed, followed in the same way by the text
'OUTPUT-SET' and the current output set; finally C is incremented.
Otherwise, the hook HTIO is disabled.

The protocol shown in Illustration 4.3 also illustrates the
instruction PLAYB. In this case, PLAYB(177701 3), recalling
that in the PDP-11 177701 is the console address of register 1,
types out information about the 3 most recent node instances of
the program at whose entry or exit register 1 appeared either as
input or output.


The message '---HALT AT 20206' following the last I/O set
is typed out by the simulator when the halt instruction is
encountered.


Illustration 4.1. The Quicksort Program

[Assembly listing of subroutine QSORT (relative addresses 0-166)
and main program QUICKSORT (170-204), in columns: relative
address, code, data, label, instruction, operands, comments.]


Illustration 4.2

Contents of file DEMO1:

!DAME Routine DEMO1
DEMO1(LOAD('QSORT 20000)                  !load QSORT file starting at 20000
      LOAD('QSORT 30000)                  !load a second copy as data to be sorted
      IOBJ(R0 0 30000) IOBJ(R1 0 30100)   !insert bounds in registers
      LMK('DEFIO '*) EX(DEFIO)            !load DAME file DEFIO; exec. routine DEFIO
      NTR()
      CR('C) IOBJ(C 0 0)                  !create C; initialize it; default node definition
      RUN(20170))                         !start to run from 20170, QUICKSORT

!DAME Routine TIO
TIO(IF(C 'LT 5 (XX(CRLF) TI('NODE:)       !if C < 5 then
                TIOBJ(CURNOBJ)            ! (type msg and current node-object
                TI('INPUT) TI('-SET:)     !  type curr. input set
                TIOBJ(CISP)
                TI('OUTPT) TI('-SET:)     !  type curr. output set
                TIOBJ(COSP)
                BA('+ C C 1))             !  incr. C)
               (DISAB(HTIO))))            !otherwise disable hook

Contents of file DEFIO:

!DAME routine DEFIO - causes initialization, building
!and closing of input/output sets
DEFIO(HOOK('NE (IOS()) 'HIOS)             !initialize output set at node entry
      HOOK('OS (BOS()) 'HBOS)             !build output set at each store operation
      HOOK('NX (COS()) 'HCOS)             !close output set at node exit
      HOOK('NE (IIS()) 'HIIS)             !initialize input set
      HOOK('OF (BIS()) 'HBIS)             !build input set at each fetch operation
      HOOK('NX (CIS()) 'HCIS)             !close input set at node exit
      DISAB(HBOS)
      DISAB(HBIS)                         !initially disable "build i/o set" hooks
      HOOK('NE ENAB 'HENB)                !at node entry, enable them
      HOOK('NX DISAB 'HDISB))             !at node exit, disable them again - in case nodes
                                          !don't cover entire program
ENAB(ENAB(HBOS) ENAB(HBIS))
DISAB(DISAB(HBOS) DISAB(HBIS))


Illustration 4.3

.RUN DAME
DAME 11/10...
**LMK('DEMO1 '*)
**EX(DEMO1)
---FILE LOADED 20000 TO 20206
---FILE LOADED 30000 TO 30206        !typed out by the LOAD commands in DEMO1

[Type-out of the node-object, input set and output set for each of
the first five node instances.]

---HALT AT 20206

!Let us now, for example, display the values of register 1 at the
!entry or exit from the most recent 3 node instances in whose I/O
!sets it appears.
**PLAYB(177701 3)
!The format of the type-out is: instance index; starting addr.;
!instr. count at entry; input set addr., output set addr.;
!value in I/O set.

[Type-out for the three node instances.]


Example 2. Construction of Node Transition Matrix

A common type of model used for representing control flow
is the so-called "transition matrix" M whose element M(i,j) is
the number of transitions from node i to node j. By dividing
each element in this matrix by the sum of all the elements in
the same row, one can obtain a Markovian transition probability
matrix P whose element P(i,j) is the probability of the next
node to be executed being node j given that the current node is
i. In this example, I shall give a DAME procedure for constructing
the matrix M. Since we do not know the number of nodes, we cannot
initially allocate the space for M. Thus, the approach we will
take is to use the data in the NODETRACE table, which is maintained
by DAME as the -11 program runs, to count the transitions between
nodes. However, this table is of fixed size and when it is full,
it is dumped onto disk and its pointer NTRACEPTR is set to
zero. This pointer is initially 4, since the table initially
contains some additional information in the first four words. We
shall maintain the integrity of the matrix M by updating M each
time the table NODETRACE is full, prior to dumping the latter
onto disk.
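The counting and normalization just described can be sketched in Python as follows. This is an illustration only, not the DAME procedure: the trace is taken to be a simple list of node starting addresses in chronological order, and the function names are hypothetical.

```python
def transition_matrix(trace):
    """Return (nodes, m): the node vector and the matrix of transition counts.

    `trace` is a chronological list of node starting addresses; nodes are
    indexed in order of first appearance.
    """
    nodes, index = [], {}
    for address in trace:
        if address not in index:
            index[address] = len(nodes)
            nodes.append(address)
    n = len(nodes)
    m = [[0] * n for _ in range(n)]
    for previous, current in zip(trace, trace[1:]):
        m[index[previous]][index[current]] += 1
    return nodes, m


def transition_probabilities(m):
    """Divide each element by its row sum; rows with no transitions stay zero."""
    p = []
    for row in m:
        total = sum(row)
        p.append([count / total if total else 0.0 for count in row])
    return p


trace = [0o20170, 0o20000, 0o20032, 0o20000, 0o20032, 0o20160]
nodes, m = transition_matrix(trace)
p = transition_probabilities(m)
```

Here the whole trace is available at once; when the trace is delivered in fixed-size pieces, as with NODETRACE, the same counts are obtained by remembering the last node of each piece and continuing from it.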

To load the program and initialize the main memory, we shall
use the DEMO1 routine of Example 1 except that the TIOBJ instruction
for typing out I/O sets will be removed.

In the file named FLW, given in the next illustration, there
are four routines: FLW, CHFLW, SFLW and FINDI. FLW is to be
executed only when the table NODETRACE has been dumped onto disk
for the first time.

FLW determines the number of existing nodes by taking the
cardinality of the "node subclass master list" pointed to by MNODESC;
stores that value in E, sets F=E^2, declares the node transition
table M as containing F words and the vector NV, which will contain
the starting address of each node, as containing F words. It also
creates the object G, which will be used to index into NV, and
sets it to zero. It then searches the node subclass master list
to determine the address of each node and fills in the vector NV.
The node whose index in NV is i will be represented by column
i and row i of M.

The  routine  CHFLW  is  activated  prior  to  each  dump  via  the 
hook  HNX1  inserted  at  run-time;  it  goes  through  the  table  NODETRACE 
and  passes  each  node-address  in  chronological  order  to  routine 
SFLW.  SFLW  computes  the  index  into  the  table  M  for  each  node  by 
calling  FINDI  to  get  the  index  into  the  vector  NV  of  the  node 
address  passed  to  it,  and  updates  M. 


The success of this procedure clearly depends on the
execution of every node defined in the program at least once before
the first dump so that FLW will see its name in the NODETRACE
table and put its address in the node vector NV. However, if
some node does not appear in NODETRACE, this will be detected
by FINDI since that node will not be found in NV, and it will
type the message 'ERROR-IN-FINDI' and return control to the
user. So, this procedure for constructing M is not foolproof,
but it is efficient since it requires little monitoring activity
between dumps of NODETRACE.

Illustrations  4.4  and  4.5  show  the  DAME  routines  and  the 
protocol,  respectively. 


Illustration 4.4

[Listing of the DAME routines FLW, CHFLW, SFLW and FINDI in file
FLW. The comments of the listing read as follows.]

!Routine FLW
!  initial search limits for NODETRACE
!  E <- no. of nodes
!  size of M, F <- E^2
!  create M and NV
!  search the node subclass master list (MNODESC)
!  get word of node-obj. into H; get left half
!  NV(G) <- H; G <- G+1; continue search

!Routine CHFLW
!  get left half of every word of NODETRACE into J
!  pass it to SFLW

!Routine SFLW
!  pop passed parameter into CNODE
!  if OLDND=0 then (OLDND <- CNODE; return)
!  F1 <- index of OLDND in NV
!  F2 <- index of CNODE in NV
!  compute index into M
!  get old count in M
!  increment and store it back
!  OLDND <- CNODE

!Routine FINDI
!  look for passed address in vector NV
!  if found, return its index
!  otherwise report error and stop


• HUN  DAMF 


OAMEI  |/|0.  .  . 


•  *LMK(  '  DEMO  I  *•)  LMK(  ’FLU  *•)  iLoad  DAME  filet  DfMOl  and  FLU.  The  hook  HTIO  hat 

Ibsen  removed  from  DHdOl 

..HOOKCNA  C  I F  (  nIKACEM  -eg  0  CHFLWM  'HNX|>  Ihook  HNXI  to  be  activated  later 


• «D  I 5AH  < MNX I ) 


Idltable  It 


• eNOOh ( ' N A  CIFCNIMACEM  ’EG  0$ 

***  (EACt’LW)  EX(CHFLW)  IOHJ1L  0  0>r\wlll  be  executed  after  1.  dump  only 

ENAH(HNXI)  D I 5AB (  HNX2  ) )  > )  $  )HNXI  will  be  Activated  after 
***  ’HNX2)  subsequent  duitpt 


.Having  placed  hooka  IINXI  and  F’iX2  we  can  atart  up  the  program  via  DQOI  which  toads  two  copies  of  It, 
Ito  be  used  as  data,  and  runs  It. 


••EAtULMOl  ) 

- KILE  LOADED  20000  TO  20206 

- FILL  LOAOEO  30000  10  30206 

---HALT  Al  20206 
•  *10dJ<NV  > 

2 0170/20000/20032/20036/20042/20056/ 
20026/20160/20136/201 54 


.'execution  finished 
I  type  the  node-vector  NV 
20070/20064/20074/20102/201 10/201 16' 


• *TOBJ(F ) 
4  00 


• • IOB  J( J  0  0) 


• • ]NCH( 0 

•  •• 

•  •  * 

•  •  • 

•  •• 

•  •  • 

O  I  0 
0  0  13 

0  0  0 
0  0  0 
0  0  0 
0  0  0 
0  0  25 

0  0  16 
0  0  0 
0  0  52 

0  0  26 
0  12  0 
0  0  0 
0  0  0 
0  12  0 
0  0  fc 


0  377  IS  Itype  M  aa  a  20x20  table. 
( BA (  • I  U  M  o>  S 
TYI 0( G)S 
BA ( ' ♦  J  J  | ) S 
I F ( J  *  EG  20  (AA(CRLF)S 

IOB J  0  0  >  >  ) ) ) 


O  0  0  0  0 

0  0  0  0  0 

144  0  0  0  0 

0  43  0  0  0 

0  0  25  0  I 
H  0  0  25  0 
0  0  0  0  0 

0  0  0  0  0 

0  0  0  0  0 

0  0  0  0  0 

0  0  0  0  0 

0  0  0  0  0 

0  0  0  0  0 

0  0  0  0  0 

0  0  0  0  0 

0  0  0  0  0 


0  0  0  0  0 

0  0  0  0  12  0 
0  0  0  12  0 
100  0  0  0  0 

0  0  0  0  0  0 
0  0  0  0  0  0 

0  0  0  0  0  0 

0  0  0  0  0  0 

0  52  26  0  0  0 

0  0  0  0  0  0 

0  0  0  0  0  0 

0  0  0  0  0  0 

0  0  0  0  0  12 

0  0  0  0  0  0 

0  0  0  0  0  0 

0  0  0  0  0  3 


In the table typed above, element (I,J) is
the number of transitions from node I to node J,
where I and J are the indices of the node addresses in
vector NV.
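
The bookkeeping performed by the FLW routines can be paraphrased outside DAME. The following is a minimal Python sketch, not the DAME code itself: the function name `transition_matrix` and the sample trace are assumptions made for illustration, while the node vector plays the role of NV and the returned matrix the role of M.

```python
def transition_matrix(node_vector, trace):
    """Count transitions between consecutively entered nodes.

    node_vector: list of node start addresses (the role of NV).
    trace: node addresses in the order in which they were entered.
    Returns a square matrix m in which m[i][j] is the number of times
    node_vector[i] was immediately followed by node_vector[j].
    """
    index = {addr: i for i, addr in enumerate(node_vector)}
    size = len(node_vector)
    matrix = [[0] * size for _ in range(size)]
    previous = None
    for addr in trace:
        if addr not in index:
            # Mirrors FINDI: an address missing from NV is an error.
            raise ValueError(f"address {addr:o} is not in the node vector")
        if previous is not None:
            matrix[index[previous]][index[addr]] += 1
        previous = addr
    return matrix


# Hypothetical trace over three of the node addresses (octal).
nv = [0o20170, 0o20000, 0o20032]
trace = [0o20170, 0o20000, 0o20032, 0o20032, 0o20000]
m = transition_matrix(nv, trace)
```

As in the DAME version, the first node entered only initializes the "old node" and adds no count, so a trace of k entries yields k-1 transitions.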


Itype  alee  of  matrix  M 


$  la  a  continuation  char. 

IQ  -  M(C) 

I  type  Q 
IJ  ►  J+l 

Ilf  J»20  then  (atart  new  line;  J  *■  0) 

0  0 
0  0 
0  0  0 
0  0  0 
0  0 
0  0 
0  0 
0  0 
0  0 
0  0 
0  0 
0  0 
0  0 
7  6 

0  0 
3  C 


one 




l»aa>  ?.  •%  U'.u.itf.  1.  ,»t  r ictsour 

Suppose  consid.-r  the  CUmIh*  problem: 

rv  *-  ‘VVtoV.::u -- <- 

execute1  IT'  ’  !°  11  a  v,'c'or  to  he  sorted 

f-rst  f’ara*lel  with  itself,  thns  enabling  „ 

the  n  f  ::ir  r!..can  r  qsort  in  p-ano  wlth  t 


one  idle  processor 
i  nd  invoke  it  to 
u  s  to  execute  t  h  e 


h  e 


th°  inStrUCU°,1S  ,°“ts!de  ^  m‘iin  loop  of  OS  ORT  l  "  I .  e .  .  200  32 


20114 


to  fi0rterimate  ttlC  t,la*ns0tl  (as  number  of 

1  sort  a  given  vector. 


second.  Ignoring 
20032  to 
instructions  executed) 


each 
comp 
p  r  oc 
t  i  ng 
ha  s 
numb 
and 
the 
t  i  ng 
word 


pa;\in  ;his 

a  nested  list  structure  r"  s'|VOh  1  *' 1  s  problem  by  construe 

two  elements.  ^he  first  Tm’^  U’C  Kacl’ 

tr  °f  lnst  ructions  which  bad  bee  n "e!  ut  ei "  2“ "  t  o' °  n  *' 
up  to  the  exit  frmm  ,i  ,  ,  executed  up  to  the  entry 

tr*-'-  Th*  second  <■  I  emnn  t  °o‘f  one  h '' l!  sc' i  Tt"  i  c  °  '  b  "  I  ^ 

111  1  L  s  usual  meaning  in  DAMF  .  ) 


10  lo»d  a"d  o  p  p  I  y  L  l' t  ^  1'  s  S  s  h  o  v  n  i^n  *  I  M  ij  s  t  r  a  L  i  on  f  4 ' 7  '1JI "  p|*  Ve  ',roc‘'“d 

-------  ri'pr"' 


■  e  n  typed  out. 


where  x 
nodes  . 


Ch  n°de  "  ,hl’  *hov«  the  Instruct  ions  -xtcuted 

o  the  entry  and  the  exit  from  the  node  as  x/v 
former  and  v  the  latter.  y-0  ind, cates  r  1, 


that  branch 


up 

is  the 


i-tcru%\;:n:c"„;dhne^\ichuet:rbiot!:est  ?at»s  throupj’ thp  irt,c’ 

t  ..ecu  t  ed  by  the  main-loop  portion  of 


5  36 

the  QSORT  routine. 




Illustration  4.6 


.'Routine  QPAR 

0PAR<L0A6<<33CUT  20000)  L0AD<'Q30RT  30000)  Hoed  program 

IOBJ(Rfl  0  30000  )  10BJ<R1  0  30040)  iset  bounds  for  20  element* 

CR  < '  T  t  a*  p )  CR<<0BJA0)  C  R  < '  IC  T )  CR<'L1)  CRCL2) 

CR<'fLAC)  CR<'T£HP?) 

CR('ROOT)  evAL<R00T  -AKELIST)  !0BJ<0BJAD  0  ROOT)  Create  root  of  U.t,  put  It  In  ROOT 
Pl/3H(0)  'push  initial  amount  of  time  pa»sed"0  and  OBJAD 

PW3H(OftJA0)  !push  address  of  list  root 

Moo«<<Air  paro  20000  20000  'hparo) 

M00«<IAIF  PAP  1  200)2  200)2  IMPAR1) 

M00K<'AIF  P AR2  20116  20116  <HPAR2) 

HOOK(tAlP  PA*)  201)6  201)6  'HPAR)) 

*U  S'  <20170) ) 

'.Routine  PARO 

P*R0  <  I  OB  J  (FLAG  0  1)  .'set  flag  to  ltdlcate  new  entry 
P0P<0BJA0)  Iget  address  of  current  object 

P0P<TfMP)  !get  amount  of  time  already  elapsed  In  this  branch 

EV*L<L2  CRE0BJ  100  0  2  0)  Icreate  a  2-word  object.  Put  Its  address  In  L2 
1  AOBJ  <12  0  TE«P)  I  Insert  In  Its  word  0  time  already  elapsed 

XX<I<<ICLU0F  L2  '»  08  JA0 )  )!execute  external  -10  routine  INCLUDE  to  put  the  new  object  in 

the  current  list 

.Routine  PARI 

'E°  1  (108J(ICT  0  IC0UNT>  0  •»)>  Hi  fl.g  1.  set,  get  number  of  U- 

PA°RU2<BeA<<e  TE-P2  ICOUhT  ICtH  ‘tr'ICtl0ni  thr°'**h  thU 

B  A  (  <  ♦  T£MP2  TE**P2  TfHP)  /teraP2  ~  temp+icount  -  let 

1p«*s  It  to  PARO  or  PAR 3 


PU3M(UMP2) 

1 AOB J ( L2  1  TE«F2) 
PU3M(08JAD) 
R0S*<Tf *P2) 

IX<EN0R)) 

I  Routine  ENDR 

EhOR<EVAL<Li  makFLIST) 


Insert  total  Instructions  In  word  1  of  new  object 
pass  address  of  current  list  to  ENDR 
pass  total  Instructions  to  ENDR 


Icreate  a  new  list.  Put  Its  address  In  LI. 

i Include  the  new  list  as  a  member  of  currant  list 


EV*l<TENP  AAKERFP  Li) 

XX  < 1 NCLU0E  TEMP  OBJAD) 

PwuM(Ll)  )  'pass  address  of  new  list  to  PARO  or  PAR) 

^Routine  PAR) 

PAR)  <PCP  (CBJA0)  P0P<UMP)  PUSH  <  TEMP )  CxaNDR))  :get  value,  of  OBJAD  and  TEMP;  call  ENDR 




•  nuN  UAMt 


U«*ltl  I  /  Id.  .  . 


•  •  Llsitl  *urA K  ’  •  ) 


Hoad  and  execute  QPAR 


•  »c.A<  urAix  ) 

■  ILt  LJADtO  10 

***MLE  LOADKI)  3dddW  lij  30?l)A 

- nALI  A I  2H2tf6  „  ,,  ,  .  . 

•  •  i  l  Jri  J(  rtOO I  )  . execution  finished 

•  JiJ-->l40b/42a  -»14?a/«  J-->l*24/ii  J  I  j.  -  ,  I2*a,  1  7*  -  -  >  t  \  7!,/ -  .  I  a' 

J':’i-?6/47b  1-->1SI7/S34  -->1515/1  )-->lil*/«, 

J  J  J-->  l  475/id  J  J  J--»  IJ7A/A1  3  --»14|3/H  J--»CA13/H  11J1 


The tree represented by this list is re-drawn below.


Illustration 4.8
Execution Tree Constructed by QPAR


244/405 


405/460 


244/ J 74 


40S/474 


J/4/4J4 


46°/5'J  424/T 


374/4  I  J 


4  ^75  4  I  j/o 


4  75/517 


5l7'/536 


Each node in the tree shows the instructions
executed along that branch up to the entry and the
exit from the node as x/y, where x is the former
and y the latter. y=0 indicates terminal nodes.


It can be seen that in the two longest paths
through the tree, 536 instructions would be executed
by the main-loop portion of the given program.
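
Reading the longest path off such a tree can be sketched as follows. This is a hedged Python illustration: the tuple representation `(entry, exit_count, children)`, the function name `longest_path` and the sample counts are assumptions, not the nested DAME list that QPAR builds.

```python
def longest_path(node):
    """Return the largest instruction count reached on any path.

    A node is a tuple (entry, exit_count, children). Terminal nodes
    are marked with exit_count == 0, like the x/y labels with y=0.
    """
    entry, exit_count, children = node
    best = max(entry, exit_count)
    for child in children:
        best = max(best, longest_path(child))
    return best


# Hypothetical tree; the counts are made up for illustration only.
tree = (10, 40, [
    (40, 55, [(55, 0, []), (55, 70, [])]),
    (40, 0, []),
])
longest = longest_path(tree)
```

With a second processor available at every recursive call, the largest count on any path is the estimate of elapsed time, measured in instructions executed.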




Example 4. Data Flow Between Two Nodes


Another class of analysis tasks for which DAME is most suitable
is the determination of the data flow between two nodes, by
finding the addresses which are both in the output set of one
and in the input set of the other. This analysis can be made in one
or both directions of flow. If the node instances happen to be
consecutive, this procedure will yield the exact nature of
the data passed from one to the other. If, on the other
hand, there are intervening node instances, then one has to monitor
their effect on the data flow between the two nodes. For a more
detailed discussion of this question, see Chapter 2, Section 2.2.2.

Representation  of  the  data  flow  from  a  node  N  to  a  node  M 
for this purpose can, at the simplest level, be a list X of sets
X1, ..., Xk, each set Xi consisting of triples (A,B,C), where
A is an address which is in the input set (strictly speaking, which
is the address part of some member of the input set) of the ith
instance Ni of N and also in the output set of some instance Mj
of M which occurred chronologically between Ni-1 and Ni. If there
are several such instances of M, then Mj is the latest one. In
the above triple, B and C are the contents of A upon entry into
Ni and upon exit from Mj, respectively. In this example, a DAME
routine DFLOW and the associated subroutines BUILD and COPY
are used to construct such a list. These routines are given in the
illustration below. The -11 program on which we operate is again the
familiar QSORT. For the purposes of this example, we shall
look at the data flow from node NA to node NB, extending from
20000 (i.e. relative address 0) to 20024 and from 20042 to 20072
respectively. NA is the first node in QSORT. It saves the contents
of registers R1, R2 and R3 on the stack, gets the parameters (which
are the lower and upper bounds of the vector to be sorted) into
R1 and R2, checks to see if the lower bound is less than the upper bound
and, if so, branches to the main portion which does the sorting.

Se  inter"?  *  s  entered  when  two  elements  which  have  to 

;  n  ^changed  have  been  found.  These  elements  are  pointed  bv 

,“CS  Wll^h  end  of  the  vector  is  to  be  advanced  in  accordance 
,  ,,  e  qu  us‘  rt  3  1  go  r  i  t  lim  ,  makes  the  advancement,  complements 

the  flags  and  branches  back  to  the  beginning  of  the  loop.  We 
note  that  while  NA  and  NB  are  not  consecutive,  the  intervening 

-"cep;'!'  °anndi?cl0Cr,°n?12K032  t0  20040  d°  n0t  m°dlf>-  —  locations 
Pt  IS  ana  IC.  As  will  be  seen,  neither  of  these  appear  in 

the  input  set  of  NB. 


The DAME routines given here are quite general and could
be applied to any two nodes by changing the limits of the nodes
in the NODE and HOOK instructions in DFLOW and the FOSET instruction
in COPY.

To explain briefly the functioning of the procedure, the main
routine DFLOW does the following:

(i) Creates some objects which will be used later; of these,
MAINL will point to the list we are interested in,

(ii) Creates a list and stores its address in MAINL,

(iii) Loads two copies of QSORT, one of which will be used
as data,

(iv) Defines nodes NA and NB,

(v) Loads and executes the DAME routine DFFIO (this routine,
as will be recalled from an earlier example, simply places hooks
at node entry and exit points to build input/output sets),

(vi) Inserts a hook to execute routine BUILD after each
execution of the instruction at location 20072, i.e. after NB,

(vii) Initializes registers R0 and R1 to contain the initial
bounds of the vector to be sorted (which is what the main program
QUICKSORT expects),

(viii) Starts the execution from location 20170, the starting
address of QUICKSORT.

The routine BUILD is thus activated after each instance of
NB and does the following:

(i) Creates a list, pointed by L,

(ii) Searches the current input set, pointed by CISP, for
addresses which also occur in the output sets of the instances
of NA and makes an entry in the list pointed by L for each such
address.

(It will be recalled that an input set is a list of one or
more ten-word objects, each word of which contains an address in
the left half and the contents of that address in the right half.
A zero indicates the end of the list. The address 0 is represented
by 777777. C1 contains the address of the current word to be looked
at. The instruction Insert Indirect, IIOBJ(C2 C1), inserts in
C2 the contents of the word pointed by C1. The COPY routine extracts
the address part of the word and uses the FOSET and FVAL instructions
to find the most recent instance of the node NA in whose
output set the address appears. If one is found, C6 will contain
the contents of that address in the output set found; otherwise
it will contain -1. (See the description of the FOSET instruction
in Chapter 3, Section 3.6.2, for a better understanding of how
this works.) If such an output set has been found, an object of
2 user words is created, pointed by C7. The contents of the word
in the input set, and the value just found in the output set are
inserted in it via the Insert Addressed (IAOBJ) instructions.
The object is put in the list pointed by L, and COPY returns to
BUILD. BUILD continues with the search until the current input
set is exhausted. At the end of the SLST instruction, L points
to a list of 2-word objects whose first word contains the data
described above.)

(iii) Puts L in the main list pointed by MAINL and types
out the list pointed by L.
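
The search performed by BUILD and COPY can be sketched in Python as follows. This is only an illustration under stated assumptions: input and output sets are modelled as dictionaries from address to contents rather than as DAME ten-word objects, and the name `flow_triples` and the sample values are hypothetical.

```python
def flow_triples(input_set, output_sets):
    """Pair each input address with its most recent producer.

    input_set: dict mapping address -> contents on entry to the
        current instance of the receiving node.
    output_sets: list of dicts (address -> contents on exit), one per
        earlier instance of the sending node, oldest first.
    Returns a list of (address, entry_contents, exit_contents) triples
    for the addresses found in at least one output set.
    """
    triples = []
    for address, entry_value in input_set.items():
        # Search the output sets starting with the most recent instance.
        for out in reversed(output_sets):
            if address in out:
                triples.append((address, entry_value, out[address]))
                break
    return triples


# Hypothetical sets (addresses and contents in octal).
na_outputs = [
    {0o177701: 0o30000, 0o177702: 0o30040},
    {0o177701: 0o30002},
]
nb_input = {0o177701: 0o30002, 0o177702: 0o30040, 0o20060: 0o2}
triples = flow_triples(nb_input, na_outputs)
```

When the entry and exit contents of a triple agree, no intervening node has altered that location; a mismatch indicates that some instruction between the two instances changed it.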




Illustration  4.9 


!routine DFLOW

DFLOW(CR('L) CR('C) CR('C1) CR('C2)
CR('C3) CR('C4) CR('C5) CR('C6) CR('C7)
CR('L1) CR('MAINL)
!create a list; put its address in MAINL
EVAL(MAINL MAKELIST)
!load two copies of QSORT
LOAD('QSORT 20000) LOAD('QSORT 30000)
!define the nodes NA and NB
NODE('NA 20000 20024) NODE('NB 20042 20072)
!load monitor routine DFFIO and execute it
LMR('DFFIO '*) EX(DFFIO)
!insert hook to build desired lists
HOOK('AIC BUILD 20072 20072
!initialize R0 and R1, and go. . .
IOBJ(R0 0 30000) IOBJ(R1 0 30040) RUN(20170))

!routine BUILD

BUILD(EVAL(L MAKELIST)     !create a list for this instance; point L to it
!search current input list pointed by CISP
SLST('. CISP (POP(C1)      !get addr. of first 10-word object
!search each word
INCR(I 0 11 1 (IIOBJ(C2 C1)   !get contents into C2
IF(C2 'EQ 0 (RET(2)))      !if 0, end search
BA('+ C1 C1 1)             !else, incr. C1
EX(COPY) ))                !call COPY
PUSH(0)))                  !continue search
XX(INCLUDE '. L '. MAINL)  !put list pointer into main-list
TIOBJ(L))                  !type list for this instance

!routine COPY

COPY(BA('/ C3 C2 1000000)  !get left half into C3
!search output sets of node NA starting with most recent instance
FOSET(C4 20000 (POP(C5)    !get address of first object
FVAL(C6 C3 C5)             !get value of address which
                           !is in C3 from the object
                           !pointed by C5 into C6
!if C6 <0, continue search else quit
IF(C6 'LT 0 (PUSH(0)) (PUSH(1)))))
!if search failed, exit routine
IF(C6 'LT 0 (RET(2)))
!otherwise, create a 2-word object; point C7 to it
EVAL(C7 CREOBJ 100 0 2 0)
!insert contents of C2 into word 0 of new object
IAOBJ(C7 0 C2)
!insert contents of C6 into word 1 of new object
IAOBJ(C7 1 C6)
!put new object in the list for the current instance
XX(INCLUDE '. C7 '. L))




Illustration 4.10


.  I  on 

JOB  14  CMl'lOA  7. 04 /DEC  5.04B  TTYlj 
PC 4 1  OB  AO  7 
PASSWORD : 

1  24  7  1  4-JUN-7  3  THUP. 

TH  1 200  ...  NEWS (6  -  14) 

.  RL'N  NDAMF 

DAME  11/10.  .  . 


**LM.R  (  '  DFI.OW  '*)  EX(DFLOW) 

- FILE  LOADED  20000  TO  20206 

- FILE  LOADED  30000  TO  30206 

1  7 7  70 1  ,  , 30000  / 30000  30000,  01  46/  1  01  46  1  7  7703,  ,0/0  1  7 7 7 0 2  ,  ,  )  00 4 0 /  3 004 0 

30040,  ,  34  1  5  /  34  1  5  1  7 7  7 0 7  ,  ,  2 005  2  /  2 00  5 2  20052  , ,  1  /  1  1  7  7  700  , ,  1  /  1  20060  ,  ,  2  /2 

1  7701  , ,30002/  30002  3 000 2 , , 1  0 24 6  /  1  0 2 4 6  1  7 7 703  ,  ,  1  01 46/ 1  01  4 6  1  77  702 
300^0/  30040  30040 ,. 1  01 46  /  10146  1  7 7  7 0 7  ,  , ° 00 5 2  /  2 0 05 2  ,  ,  1  /  1  1  7  7  700  1  7  7  7  76 

/I  7  7  7  76  2006  6  ,  ,  1  7  7  7  7  6  /  1  7  7  7  76 

177701..  30002/30002  30002, ,10146/10146  177703, ,10246/10246  177702 
300034  /  300  34  3 00 3 4  ,  ,  3 4  3 0 /  3 4  3 0  1  7 7  7 0 7  , ,  2 0 05 2  / 2 00 5  2  2005  2  ,  ,  1  /  1  1  77  700  * 

20060 , , 2/2 

177701..  30004  '30004  30004, ,10346/10346  177703,, 10146/10146  177702 

30034/3003  4  30034,, 10146/10146  177707,, 20052/20052  20052  1/1  177700 

1  7  7  7  76  /  1  7  7  7  7  6  2 00 6 6  ,,  1  7  7  7  7 6  /  1  7  7  7  7 6 

1  7  7  7  01  ,, 30004/30004  30004, ,10146/10146  177703  ,  ,10346/10346  177702 
30030/30030  3 00 3 0 , , 4 5 3 / 4 5 3  1 7 7 7 0 7 , , 2 0 0 5 2 / 2 00 5 2  20052, ,1/1  177700  1/1 
20C60 , ,2/2  ' 

1  77701  ,.  30006/  30O06  3O()06  ,  ,  1  2  7  00  /  1  2  7  00  1  7  7  7  0  3  ,  ,  1  0 1  4  6  /  1  0 1  4  6  1  7  7702 
30030/  30030  3 00 3 0 ,  , 1  0 1  4 6  /  1  0 1  4 6  1  7 7  7 0 7 ,  ,  2 0 0 5 2  / 2 00 5 2  20052, ,1  /1  1  77  700 
1  7  7  7  76  /  1  7  7  7  7  6  2 006 6  ,  ,  1  7  7  7  7 6  /  1  7 7  7  7 6  1 

177701 . .  30006/  30006  30006, ,10146/10146  177703, ,1  2700/12700  177702 

30026/3007.6  30026  ,,  5000/  5000  1  7  7  7  0  7  ,  ,  2  00  5  2  /  2  00  5  2  20052  1  /  1  1  77700* 

1  /  1  20060,  ,2/2 

'  C 




Example 5. Analysis of Instruction Mix and
Addressing Mode Usage by PDP-11 Programs


This example is based on an experiment in which we were
interested in comparing the performances of the PDP-11/20, /40
and /45 in connection with a proposal for the acquisition of
several processors for the Carnegie Multi-miniprocessor (C.mmp).
What we wanted was a rough estimate of the relative speeds with
which these processors would execute programs typical of the
workload to be placed on them here. The procedure followed was
as follows:

(i) Four available programs were selected as benchmark
programs: two hand-coded in assembly language, and two BLISS/11
programs. The assembly language programs were an interactive
disassembler for the PDP-11 written by Roy Levin and the "vector
mode" portion of the XGP (Xerox Graphic Printer) support program
written by George Robertson and Hal Van Zoeren. The two BLISS/11
programs were an interactive PDP-11 debugging aid written by
Roy Levin and the Quicksort program used in the preceding examples.
These programs were judged to be a good cross-section of the
workload to be run on the C.mmp, excluding the number-crunching
programs.

(ii) The information required to project the performances
of the models /40 and /45 was derived from the respective processor
manuals,

(iii) A DAME routine (IMIX) was written to monitor the
execution of these four programs and gather the required data,

(iv) A DAME routine (RPORT) was written to summarize and
report the collected data in the form of instruction mix, addressing
mode usage and branching statistics,

(v) Two BLISS/10 programs were written to calculate the
performances of each of the /40 and /45. (These were needed
because of the wide dissimilarity in the forms of the processor
specifications given in the manuals for the two machines.) These
programs were written in BLISS rather than DAME because of the
relatively large amount of arithmetic, table look-up etc. that
was required. This fact also turned out to be a good test of the
ease with which data could be communicated between DAME and BLISS,
which was found to be very easy and natural.

(vii) The DAME routines and the BLISS models of the /40 and
/45 were debugged and hand-checked over short sequences of -11
code,




(viii) Several runs of varying lengths were made with each
of the four benchmark programs with different inputs. The collected
data was incorporated into a memorandum and sent to various faculty
and staff members connected with the C.mmp project.


In this example, I shall go over the IMIX program. As mentioned
above, the function of this routine was to build various
tables and accumulate counts during execution. Below is a list
of these data items (all integers below are decimal; in the listing
of the IMIX routine itself in the next illustration, they are in octal):


DOTAB: a 12x8 table containing a count of each of the twelve
double-operand instructions broken down by the eight
destination modes,

SOTAB: a 26x8 table for single-operand instructions, format
similar to DOTAB,

TOTICOUNT: a vector containing a count for each op-code
(indexed by OPN - see below),

DOSN0: a 12x8 table for double-operand instructions whose
source operand mode is not 0, broken down by destination mode,

DOS0: a 12x8 count table for double-operand instructions
whose source operand mode = 0,

TOTSMOD: a 12x8 count table for double-operand instructions
by source mode,

JSRCO: a count vector for JSR instructions by dst. mode,

DSR7: a count of instructions whose destination operand is
register 7 (PC),

JMPR7: a count of JMP instructions whose destination operand
is register 7,

TOTDO: total number of double-operand instructions,

TOTSO: total number of single-operand instructions,

TOTCCOC: total number of condition code operators,

TOTBR: total number of conditional branch instructions,

SUCCBR: total number of successful conditional branches,

BRPD: total distance covered by positive branches,

PBCNT: total number of positive branches,

BRND: total distance covered by negative branches,

NBCNT: total number of negative branches,

UNSUCCB: total number of unsuccessful conditional branches.

In  performing  these  calculations,  IMIX  uses  a  number  of  data 
items  supplied  by  the  simulator.  These  are  (all  items  refer  to 
the  current  -11  instruction): 

OPN:  a  unique  integer  representing  the  op-code, 

(Note:  the  op-code  itself  is  not  suitable  for  this  purpose) 

SRCMODE:  source  operand  mode, 

DSTMODE:  destination  operand  mode, 

DSTREG:  destination  register, 

OPC:  character  representation  of  mnemonic  op-code, 

OLDPC:  last  value  of  PC, 

The  IMIX  routine  itself  is  given  in  the  next  illustration. 
The  protocol  and  the  results  of  the  analysis  are  not  given  here 
because  that  would  require  the  inclusion  of  the  RPORT  routine 
as  well  and  possibly  also  the  BLISS  routines  for  projecting  the 
performances  of  the  /  4  0  and  /  4  5  .  I  do  not  consider  the  actual 
results  of  that  analysis  as  important  for  this  thesis  as  the 
description  of  the  methodology. 




Illustration  4.11 


IMIX(CR('C1  300  0  1)  CR(’TOT) 

CR(  D0TAB  100  0  14°)  .'will  contain  d.o.  instr.  counts  bv  dst.  node 
0R(  SMPCT  100  0  140)  .'table  for  src.  mode  percentages 


CP ( ' S0PCT  100  0  320)  Itahle  for  single  opd.  instr.  percentages 

CRTTOIDO)  CR(’TOTSO)  CR('TOTBR)  CR('TOTMS) 

CRCJSRCO  100  0  10) 

CR('BRPD)  CR('BRND) 

CP('PBCNT)  CR('NBCNT) 

CR('Tl)  CR('T2)  CR('T3)  rp('T4) 


linsert  hook  to  execute  MIX  after  everv 
HOOKCIC  Mix  '  HMIX)  )  y 


instruction 


MIX^BA('+  T1  Jr!)01'  °PN^  *  increment  TOT  I  COl’NT  [  OPN  ] 
IOBJ (TOTICOU  OPN  Tl) 


.°PN  ’  CaU  f°r  aPProPriate  action 
“  13  INDO  .’if  double  operand,  call  INDO 
(IP  ( ° p N  LE  45  INSO  .'if  single  operand,  call  INSO 

(IF(OPN  LE  57  INCOP  !if  cond.  code  opr.,  call  INCOP 
•it  conditional  branch  then  call  INCBR, 

.'else  incr.  "misc.  instruction"  count 
( I F ( OPN  'LE  77  INCBR  (BA('+  TOTMS  TOTMS  1) 

•'if  JSR,  call  INJSR 
I F ( O  P  N  'EO  100  INJSR)))))))))) 


.'double-operand  instruction  handler 
INDO  ( BA  (  '  *  T  3  OPN  10)  .’compute  index  into 
BA  (  '+  72  T 3  SPCMODF) 

BA( '  I  Tl  TOTSMOD  T2) 


TOTSMOD  table, 


incr. 


B  A  (  +  Tl  Tl  1) 

IOBJ (TOTSMOD  T2  Tl ) 

.incr.  DSR7  if  required 
I F (DSTMODE  ' EQ  0  ( I F ( DS  T  R  EC  ' 
.'increment  count  according  to 
I F ( S  RCMOPF  'GT  0  INCG0  INCE0) 
•incr.  total  d.o.  count 


EO  7  ( BA (  '  +  DS R 7  DSR7  1))))) 
whether  srcmode  is  0  or  not 


B A ( ' +  TOTDO  TOTDO  1)) 


table  entry 


I N  C  C  0 ( B  A ( ' +  T  4  T  3  DSTMODE)  liner. 
BAf':  Tl  DOS  NO  T4 ) 

BA( '+  Tl  Tl  1) 

IOBJ ( DO S N 0  T 4  Tl)) 


count  for  d.o.  instr.  with  srcnode^O 




INCE0(BA('+ T4 T3 DSTMODE)   !incr. count for d.o. instr. with srcmode = 0
BA('! T1 DOS0 T4)
BA('+ T1 T1 1)
IOBJ(DOS0 T4 T1))

!single-operand instruction handler
INSO(BA('- T3 OPN 14)        !compute index into SOTAB
BA('* T3 T3 10)
BA('+ T4 T3 DSTMODE)
BA('! T1 SOTAB T4)
BA('+ T1 T1 1)
IOBJ(SOTAB T4 T1)            !incr. SOTAB entry and store it back
BA('+ TOTSO TOTSO 1)         !incr. total s.o. count
!increment DSR7 and JMPR7 if required
IF(DSTMODE 'EQ 0 (IF(DSTREG 'EQ 7
(BA('+ DSR7 DSR7 1)
IF(OPC 'EQ 'JMP (BA('+ JMPR7 JMPR7 1))))))))

!increment total cond. code operator count
INCOP(BA('+ TOTCCOC TOTCCOC 1))

!increment total branch count, take care of successful and unsucc. branches
INCBR(BA('+ TOTBR TOTBR 1)
BA('- T1 PC OLDPC)
IF(T1 'NEQ 2 INCSB INCUB))

!successful branch
INCSB(BA('+ SUCCBR SUCCBR 1)
!accumulate positive (forward) and negative (backward) branch
!distances and counts
IF(T1 'GT 0 (BA('+ BRPD BRPD T1) BA('+ PBCNT PBCNT 1))
(BA('+ BRND BRND T1) (BA('+ NBCNT NBCNT 1))))

!unsucc. branch
INCUB(BA('+ UNSUCCB UNSUCCB 1))

!increment JSR
INJSR(BA('! T1 JSRCO DSTMODE)   !incr. JSRCOUNT-by-DSTMODE
BA('+ T1 T1 1)
IOBJ(JSRCO DSTMODE T1)
IF(DSTMODE 'EQ 0
!incr. JSRR7 if required
(IF(DSTREG 'EQ 7
(BA('+ JSRR7 JSRR7 1)))))
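
The branch bookkeeping done by INCBR, INCSB and INCUB can be paraphrased in Python. This is a hedged sketch rather than a translation of the whole routine: the class name `BranchStats` and its field names are assumptions, and the test for a successful branch (the PC having moved by something other than 2) is taken from the listing above.

```python
from dataclasses import dataclass


@dataclass
class BranchStats:
    total: int = 0               # role of TOTBR
    successful: int = 0          # role of SUCCBR
    unsuccessful: int = 0        # role of UNSUCCB
    positive_count: int = 0      # role of PBCNT
    positive_distance: int = 0   # role of BRPD
    negative_count: int = 0      # role of NBCNT
    negative_distance: int = 0   # role of BRND (accumulates signed values)

    def record(self, old_pc: int, pc: int) -> None:
        """Record one conditional branch given the PC before and after."""
        self.total += 1
        delta = pc - old_pc
        if delta == 2:
            # The PC simply advanced one word: the branch was not taken.
            self.unsuccessful += 1
            return
        self.successful += 1
        if delta > 0:
            self.positive_count += 1
            self.positive_distance += delta
        else:
            self.negative_count += 1
            self.negative_distance += delta


stats = BranchStats()
for old_pc, pc in [(0o20042, 0o20044), (0o20050, 0o20060), (0o20072, 0o20032)]:
    stats.record(old_pc, pc)
```

As in the listing, backward distances are added with their sign, so the accumulated negative distance is itself negative.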




CHAPTER  5 

A  PERFORMANCE  MODEL  FOR  DAME-LIKE  SYSTEMS 


Having given a description of the design of the DAME system
and illustrative examples of its application in various types
of analysis tasks, it is now worthwhile to consider the resource
requirements and performance of DAME-like systems. It should
be clear by now to those who have examined Chapters 3 and 4 that
such systems are very costly in terms of main storage and CPU
time. Thus in this chapter, I would like to construct a model
of the operation of DAME-like systems and parameterize the resource
requirements of each major component of that model. To do this,
I shall proceed as follows: First, I shall give an informal and
intuitive definition of what I mean by "DAME-like" systems
(Section 5.1). Then, I shall construct a more structured and
concise model of such systems, exhibiting the overall control flow
structure and the main "cost centers", ignoring the costs incurred
by any hooks, i.e. involving only the object machine simulator
and checks for hooks (Section 5.2). This will be followed by a
characterization of the overhead of two major types of monitoring
operations which are essential to our approach; namely, the
monitoring of node entry and exits (including the maintenance of
the node trace table) and the construction of input/output sets
(Section 5.3). These operations, while they are implemented in
the DAME system by the insertion of hooks by the system itself
just as a user would insert hooks, should be regarded as integral
parts of the analysis facility and hence, their performance is
considered a significant part of the basic performance of such a
facility. Thus, at the end of Section 5.3, we will have constructed
a rough theoretical model of the object machine simulator including
the checks for every type of hook defined in the DAME system
and we shall have superimposed on this model a model of the overhead
of the execution trace facility, i.e. the node and input/
output set mechanisms. This will provide a picture of the overall
operating overhead of such a facility excluding any user hooks.

The amount of overhead introduced by user hooks is, of course,
a function of the actions performed by the specific hooks and
therefore cannot be modelled in general.

Finally,  in  Section  5.4,  some  measurements  of  the  PDP-11 
simulator,  the  node  entry/exit  overhead  and  the  input/output 
set  overhead  are  given. 




5.1 An Informal Characterization of DAME-like Systems

He  shall  call  a  system  "DAME-1  lice"  If  1»  P em 

is  the  monitoring  and  dynamic  “  J'  b  (i)  permitting 

analysis)  of  the  behaviour  of  the  object  system  ;  P 

the  user  to  define  a  structure  over  •>»  ‘  t  s  of 

collecting  execution  history  data  cki  t0  any  point  in 

that  structure  in  such  a  way  tha  (-Mi)  permitting  the  user 

i-tr“norntc^:^:tioie:nds^pt^ar^oLntr<node,d 

here^is^that^they*  op  erate°onb  single -stream  ^sequential  ^processors 

where  the  system  state  resul  mg  .  ,  tate  of  the  pro¬ 

can  be  completely  determined  from  t  e  ^n^am  ,  that  DAME-like 
cessor  and  the  inputs.  This  mean  ,  behaviour  of  programs 

:irherhaera:yn0dtepIenienreitoendth°e  ^k^cS^ 

rSiite^r  - 

difficult  toaaccomplSishnwith  DAME-like  systems. 

5.2  A  Model  of  DAME-like  Systems 

The basic operational cycle of a [illegible]. The addition
[illegible] of the instruction cycle of the object machine with
the [illegible] accesses. Arbitrary levels [illegible] permitted.
On the other hand, for the purpose of keeping the exposition
simple, only single-word accesses (i.e. no block transfers,
half word or byte-addressing) will be considered. Even with
this restriction, it is [illegible] accurate and constructive
model of the instruction decoding process which will describe
all conceivable processors satisfying this restriction. In the
model given below, [illegible] following kind of an instruction
decoding process. The instruction is fetched; its opcode is
determined; the number of operands is determined; the address
of each operand is [illegible]; each operand is fetched or
stored, one at a time, [illegible] access-type as determined
from the instruction; each store operation is usually preceded
by a computation of the value to be stored [illegible] the
operands which have been fetched so far.




We can break down the total cost, C, of the simulated
execution of an object machine instruction into 3 parts:

1- The basic cost, C_B, of indexing into the object machine
memory to get the instruction, executing the instruction and
checking for interrupts,

2- The cost of scheduling memory access events and updating
the clock (C_S),

3- The cost of checking for hooks at each contact point (C_H).

II 

Clearly, these cost components are not incurred in lumps,
but rather they are interleaved throughout the execution of each
instruction. C_B depends on the semantics of each instruction and
how easily it can be emulated on the host machine.

C_S is a direct function of the number of events to be scheduled.
In a memory-cycle level simulator, for an instruction involving n
operands in the main memory, C_S = (n+2) T_S, where T_S = the cost
of scheduling an event and activating it. The two events in addition
to the n memory accesses for operands are for simulating the
delays for fetching the instruction and performing the operation.

C_H is a direct function of the total number of operands
fetched or stored, including side-effects, by the instruction.
It involves two kinds of overhead: checking for general hooks
and checking for addressed hooks. Thus, for an instruction
involving a total of m operands, C_H = (m+2)(T_GH + T_AH), where
T_GH = overhead of checking for general hooks, T_AH = overhead of
checking for addressed hooks, and the two additional checks are
for instruction fetch and instruction completion hooks.

Thus for a simulator which has been written in a "loose"
way so that inserting checks for hooks will not cause much
perturbation, if the average number of operands of an instruction
which are located in the main memory is n, then the ratio R_1 =
(simulation time/real time) with no checking for hooks will be

    R_1 = (C_B + (n+2) T_S) / T_r




where T_r is the average time to execute the same kind of
instruction (i.e. involving n main memory accesses) on the real
object machine. If we add to this the overhead for checking for
hooks with an average number, m, of total operands per instruction
we get

    R = (C_B + (n+2) T_S + (m+2)(T_GH + T_AH)) / T_r

which is a broad-gauge, general model of the performance of a
DAME-like system with no hooks attached. If the object machine
simulator has been implemented at the instruction level, rather
than at memory cycle level, then the associated overhead can be
found by setting n=0. Further, if hooks can only be inserted at
instruction fetch/completion level, rather than operand fetch/store
level, the corresponding overhead can be found by setting m=0.
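
The model above can be evaluated numerically with a few lines of Python. This is only a sketch: the class name CostModel, its field names and the sample figures are illustrative assumptions, not part of DAME, and all costs are taken to be in the same (arbitrary) time unit.

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    """Parameters of the slow-down model (names are illustrative)."""
    c_b: float   # C_B: basic cost of emulating one instruction
    t_s: float   # T_S: cost of scheduling and activating one event
    t_gh: float  # T_GH: cost of one general-hook check
    t_ah: float  # T_AH: cost of one addressed-hook check
    t_r: float   # T_r: time of the same instruction on the real machine

    def slowdown(self, n: float, m: float, check_hooks: bool = True) -> float:
        """Return R (or R_1 when check_hooks is False).

        n: average main-memory operands per instruction
        m: average total operands per instruction
        """
        cost = self.c_b + (n + 2) * self.t_s
        if check_hooks:
            cost += (m + 2) * (self.t_gh + self.t_ah)
        return cost / self.t_r


# Made-up figures, purely to exercise the formula.
model = CostModel(c_b=10, t_s=2, t_gh=1, t_ah=1, t_r=1)
r1 = model.slowdown(n=1, m=2, check_hooks=False)
r = model.slowdown(n=1, m=2)
```

Setting n=0 or m=0 in the call reproduces the two special cases mentioned in the text (instruction-level simulation, and hooks at instruction fetch/completion level only).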

5.3  The  Overhead  of  the  Node  Mechanism 

The overhead introduced by the Node Mechanism can be
considered in two parts: (i) the overhead due to checking for entry
and exits from nodes, and (ii) the overhead for the construction
of input/output sets. Let us consider these two components in
turn.


5.3.1 The Overhead of Node Entry and Exits


Let  us  first  consider  the  case  where  nested  nodes  are  not 
permitted.  In  this  case  the  procedures  for  detecting  node  entry 
and  node  exit,  which  I  shall  denote  by  ENTRYP  and  EXITP  respectively, 
can  help  each  other  significantly  by  communicating  to  each  other 
information  as  to  whether  an  entry  or  exit  has  been  performed. 

Since nested nodes are not permitted, each node entry must be
followed by a node exit before another node entry can occur.
Similarly, every node exit must be followed by a node entry before
another node exit can occur. Further, since we do not assume that
the defined nodes cover the entire program, there will be times
when the control flow will not be inside any node. Hence, after
EXITP tells ENTRYP that the last node has been exited and therefore
that a new node may begin anytime, ENTRYP must check with each
subsequent instruction fetch to see if a new node is being entered.
The cost of this check will depend strongly on its implementation.
For example, if there are two additional bits in the representation
of the object machine available for this use, these can be used
to indicate the first and the last instructions of a node. Otherwise
a list of node definitions can be searched; or alternately, as
in DAME, each used memory location can be assigned an "attribute
list" and a node descriptor can be put on the attribute list of
the starting address of the node. In the last alternative, one
proceeds as follows after each instruction fetch:

1- See if the instruction address has an attribute list
(most addresses won't);

2- If not, it can not be a node entry; hence, return;

3- See if there is a node-descriptor object on the attribute
list;

4- If not, return;

5- Compare the node starting address given in the node
descriptor object with the current instruction address to make
sure they coincide.
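
The five steps above can be sketched in Python as follows. The representation is an assumption made for illustration: attribute lists are held in a dictionary keyed by address, and NodeDescriptor and entryp are hypothetical names, not DAME's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeDescriptor:
    """Hypothetical node descriptor: a name plus first and last address."""
    name: str
    start: int
    end: int


def entryp(address, attribute_lists):
    """Return the descriptor of the node entered at `address`, or None.

    `attribute_lists` maps an address to its attribute list; addresses
    without an attribute list are simply absent (assumed layout).
    """
    attributes = attribute_lists.get(address)      # step 1
    if attributes is None:                         # step 2
        return None
    for item in attributes:                        # step 3
        # step 5: the descriptor must start at the current address
        if isinstance(item, NodeDescriptor) and item.start == address:
            return item
    return None                                    # step 4


lists = {
    0o1000: ["some other attribute", NodeDescriptor("A", 0o1000, 0o1010)],
    0o2000: ["some other attribute"],
}
entered = entryp(0o1000, lists)
```

Only addresses that carry an attribute list cost more than one dictionary lookup, which mirrors the remark that most addresses will not have one.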

The most expensive step in this procedure is step 3, and even that
step is not very costly, since there usually aren't more than four
or five items on the attribute list of any location. The real
cost of this procedure lies in the inclusion of an attribute list
pointer potentially for every object machine location.

If the approach for detecting a node entry is searching
the list of node definitions to see if there is a node starting
at the current instruction address, then, assuming a binary
search over a list of n node descriptors ordered by their starting
addresses, the average number of comparisons will be on the order
of log_2 n.
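
A minimal sketch of this second approach, using Python's standard bisect module for the binary search. The function name and the representation of node definitions as a sorted list of starting addresses are assumptions for illustration.

```python
from bisect import bisect_left


def find_node_starting_at(address, starts):
    """Return the index of the node starting at `address`, or None.

    `starts` must be sorted in ascending order; the search takes on
    the order of log2(len(starts)) comparisons.
    """
    i = bisect_left(starts, address)
    if i < len(starts) and starts[i] == address:
        return i
    return None


node_starts = [100, 200, 300, 400]
hit = find_node_starting_at(300, node_starts)
miss = find_node_starting_at(250, node_starts)
```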

Each of these three approaches for detecting a node entry
requires a check to be done for every instruction executed
from the last node exit until a new entry is detected.
Thus, the total overhead caused by any one of the three is also
a function of the total number, Q, of such instructions executed.
If we denote by S the ratio of the number of executed object
machine instructions which belong to a node to the total number
of object machine instructions executed, and by O_NE the overhead
per instruction caused by the particular approach for node detection,
then the average overhead per instruction caused by the ENTRYP
procedure, without nested nodes, will be O_NE(1-S). This formula
shows that, if there are large segments of executed code which
do not belong to a node, this can be a significant overhead.




The procedure for detecting the exit from the current node,
again assuming no nested nodes, is much simpler and requires a
comparison operation after each instruction in the node. Thus,
if we denote by O_NX the cost of making a comparison, the overhead
for detecting the exit from a node is O_NX S.

In addition to detecting entry and exits, there is a cost
for creating an entry in the Node Trace table for each node
executed. Let us denote that overhead by O_NT. If the average
number of instructions per node instance is I_NI, then this overhead
is O_NT S / I_NI.
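
The three per-instruction terms derived in this subsection can be added up in a short Python sketch. The function name and the sample values are illustrative assumptions; the formula is the sum of O_NE(1-S), O_NX S and O_NT S / I_NI as given above.

```python
def node_overhead_per_instruction(o_ne, o_nx, o_nt, s, i_ni):
    """Average entry/exit/trace overhead per executed instruction.

    o_ne: O_NE, cost of one node-entry check
    o_nx: O_NX, cost of one node-exit comparison
    o_nt: O_NT, cost of creating one Node Trace entry
    s:    S, fraction of executed instructions that lie inside a node
    i_ni: I_NI, average number of instructions per node instance
    """
    entry = o_ne * (1 - s)      # checked only while outside any node
    exit_ = o_nx * s            # checked only while inside a node
    trace = o_nt * s / i_ni     # one trace entry per node instance
    return entry + exit_ + trace


# Made-up figures, purely to exercise the formula.
overhead = node_overhead_per_instruction(o_ne=2, o_nx=1, o_nt=10, s=0.5, i_ni=5)
```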

5.3.2  The  Overhead  of  I/O  Set  Maintenance 

Now  let  us  consider  the  largest  component  of  cost  associated 
with  the  Node  Mechanism,  namely  the  construction  of  input-sets 
and  output  sets. 

The  construction  of  an  input-set  involves  the  following 
general  steps: 

I1- At node entry, allocate space for the set,

I2- After every fetch operation, determine if the fetch
address is already in the input-set or the output-set (i.e. if
it has been fetched or written previously in this node instance),

I3- If not, add the address and its contents as an element
to the input-set.

The construction of an output-set similarly involves the
following steps:

O1- At node entry, allocate space for the set,

O2- Before each store operation, determine if the store
address is already in the output-set,

O3- If not, add the address (with an undefined content) as
an element to the set,

O4- At exit from the node, fill in the current contents
of all the addresses in the output-set.
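
The two procedures above can be sketched together in Python. This is an illustration under stated assumptions, not DAME's implementation: object machine memory is modelled as a dictionary from address to value, the sets are dictionaries from address to contents, and the class and method names are hypothetical.

```python
class NodeInstanceIO:
    """Builds the input-set and output-set of one node instance."""

    def __init__(self, memory):
        self.memory = memory
        self.input_set = {}    # steps I1 and O1: allocate at node entry
        self.output_set = {}

    def fetch(self, address):
        value = self.memory[address]
        # step I2: has the address been fetched or written before?
        if address not in self.input_set and address not in self.output_set:
            self.input_set[address] = value          # step I3
        return value

    def store(self, address, value):
        # step O2: checked before the store is performed
        if address not in self.output_set:
            self.output_set[address] = None          # step O3: undefined content
        self.memory[address] = value

    def exit(self):
        # step O4: fill in the final contents at node exit
        for address in self.output_set:
            self.output_set[address] = self.memory[address]


memory = {10: 1, 11: 2, 12: 0}
io = NodeInstanceIO(memory)
a = io.fetch(10)
io.store(12, a + 5)
io.fetch(12)        # already written in this instance: not an input
io.store(10, 9)
io.exit()
```

In the usage above, location 10 is an input (its value before the node ran is recorded), while locations 12 and 10 are outputs with their values at exit.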


Since,  in  general,  the  size  of  an  I/O  set  can  not  be  predicted 
in  advance,  some  decision  has  to  be  made  as  to  how  space  will  be 
allocated.  It  clearly  is  wasteful  to  obtain  new  space  for  each 
element  and  link  it  to  the  rest.  There  are  similar  problems 
with  completely  static  allocation.  The  best  procedure  seems  to 
be  some  kind  of  a  compromise  between  the  two.  (In  DAME,  this 
is  handled  by  obtaining  space  in  10-word  chunks,  each  word  to 
contain an (address, value) pair. Unused words will contain -1
in the address half. These 10-word chunks are put in a list.
The list-head has one user word which contains the index of the
next slot in the last member of the list.) Let us denote by
IS_L and IS_E the average overhead for creating the list head and
for adding a new element, respectively.
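
The chunked allocation scheme described in the parenthesis can be sketched as follows. The class name, the use of Python lists for chunks and of tuples for (address, value) pairs are assumptions for illustration; only the chunk size of 10 and the -1 marker come from the text.

```python
CHUNK_WORDS = 10   # chunk size given in the text
UNUSED = -1        # marker in the address half of an unused word


class ChunkedSet:
    """An I/O set that grows in fixed-size chunks."""

    def __init__(self):
        self.chunks = []               # the list of chunks
        self.next_slot = CHUNK_WORDS   # forces allocation on the first add

    def add(self, address, value=None):
        if self.next_slot == CHUNK_WORDS:
            # last chunk is full (or there is none yet): obtain a new one
            self.chunks.append([(UNUSED, None)] * CHUNK_WORDS)
            self.next_slot = 0
        self.chunks[-1][self.next_slot] = (address, value)
        self.next_slot += 1

    def elements(self):
        return [pair for chunk in self.chunks
                for pair in chunk if pair[0] != UNUSED]


io_set = ChunkedSet()
for address in range(12):
    io_set.add(address, address * 2)
```

Here next_slot plays the role of the user word in the list-head: it indexes the next free slot in the last chunk, so adding an element never requires scanning the list.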

The  cost  of  determining  whether  or  not  an  address  should  be 
added  to  an  I/O  set  (i.e.  whether  it  is  a  new  or  an  existing 
address)  depends  strongly  on  the  implementation.  In  DAME,  this 
is done by using bits 16 and 17 (from the right) of the PDP-10
word representing a PDP-11 word, to indicate those words which are
already members of the output set and the input set respectively.
Hence, the overhead amounts to testing these bits of each word
being accessed, and possibly setting one of them. If we denote
by w, B_1 and B_2 the ratio of the number of distinct operands to
the number of total operands, the overhead of testing a bit and
the overhead of setting a bit, respectively, then the overhead
of this approach for the input and output sets, per instruction
inside a node, is 2m(B_1 + wB_2) and per node, it is
2 I_NI m(B_1 + wB_2).
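
A sketch of the flag-bit technique follows. Several details are assumptions made for illustration: bits are numbered from zero at the right, so that the 16-bit object machine word occupies bits 0 to 15 of the host word; host words are modelled as Python integers in a list; and the function names are hypothetical.

```python
OUT_BIT = 1 << 16      # word is already a member of the output set
IN_BIT = 1 << 17       # word is already a member of the input set
VALUE_MASK = 0xFFFF    # the 16-bit object machine word itself


def note_fetch(memory, address):
    """Test the flag bits on a fetch; return True if the address is new."""
    word = memory[address]
    is_new = not (word & (IN_BIT | OUT_BIT))
    if is_new:
        memory[address] = word | IN_BIT
    return is_new


def note_store(memory, address):
    """Test the output flag on a store; return True if the address is new."""
    word = memory[address]
    is_new = not (word & OUT_BIT)
    if is_new:
        memory[address] = word | OUT_BIT
    return is_new


mem = [0o1234, 0o17]
first_fetch = note_fetch(mem, 0)
second_fetch = note_fetch(mem, 0)
```

Each access costs one bit test (B_1) and, for a new address only, one bit set (B_2), which is where the factor (B_1 + wB_2) comes from. A real implementation would also have to clear the flags at node exit, which this sketch omits.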

Let  us  now  consider  the  case  when  the  implementation  does  not 
permit  this  approach  (i.e.  there  are  no  available  bits).  Let  us 
suppose  that  the  "brute  force"  method  of  searching  the  I/O  set 
to  determine  if  a  given  address  is  in  it  or  not  is  being  used. 
Whenever an address is generated, the average number of existing
elements in an I/O set is w m I_NI / 2, the number of comparisons
caused by new elements is w^2 m I_NI / 2 and the number of
comparisons caused by old elements is (1-w) w m I_NI / 4. Thus,
the average total number of comparisons for constructing the input
and the output sets of a node instance using this approach, assuming
that the above parameters are equal for both input and output sets, is:

    2(w^2 m I_NI / 2 + (1-w) w m I_NI / 4)

      = w m I_NI + (w^2 m I_NI / 2)
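
The "brute force" method itself is easy to state in code. The sketch below builds a set by linear search and counts the comparisons actually made for a given address sequence, which allows the estimates above to be checked empirically against a recorded trace. The function name and the list representation are assumptions for illustration.

```python
def build_set_linear(addresses):
    """Build a set of distinct addresses by linear search.

    Returns the members in order of first appearance together with the
    total number of comparisons performed.
    """
    members = []
    comparisons = 0
    for address in addresses:
        found = False
        for member in members:
            comparisons += 1
            if member == address:
                found = True
                break
        if not found:
            members.append(address)   # a new element scans the whole set
    return members, comparisons


members, comparisons = build_set_linear([5, 7, 5, 9])
```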




Then, the average overhead, O_IO, per executed object
machine instruction for constructing I/O sets is:

    O_IO = (S / I_NI) * (w m I_NI + (w^2 m I_NI / 2))

         = S w m (1 + w/2)

where S and I_NI are as before.

We are now in a position to give an estimate of the average
total overhead, O_I, per executed object machine instruction:

    O_I = C_B + (n+2) T_S + (m+2)(T_GH + T_AH)
          + O_NE (1-S) + O_NX S + O_IO

where

C_B = the average cost of emulating one object machine
instruction, with no event scheduling or checking for monitor hooks,

n = average number of main memory accesses per OM instruction,

T_S = the cost of scheduling an event and activating it,

m = total number of operands per OM instruction,

T_GH = overhead of checking for a general hook,

T_AH = overhead of checking for an addressed hook,

O_NE = overhead per OM instruction of detecting a node entry,

S = ratio of OM instructions belonging to some node to the
total number of executed OM instructions,

O_NX = overhead per OM instruction of detecting a node exit,

w = ratio of the number of distinct operands to total
operands generated over the course of the execution,

O_IO = overhead per executed instruction due to construction
of I/O sets.

5.4 Measurements of the DAME System

In  this  section,  some  measurements  of  the  overhead  of  the 
DAME  system  along  the  lines  outlined  above  will  be  presented. 

First,  a  disclaimer  note  is  in  order:  as  mentioned  previously, 

the  minimization  of  the  resource  requirements  was  not  a  primary 
goal  in  the  design  and  implementation  philosophy  of  the  DAME 
system  and  often  these  goals  were  neglected  in  favor  of  flexibility 
in  the  analysis  facilities  offered  in  order  that  new  and  useful 
features  may  be  discovered.  This  philosophy  has,  in  this  author's 
opinion,  met  its  goals.  On  the  other  hand,  the  performance  has 
been  worse  than  expected.  Thus,  the  real  purpose  of  this  section 
is  to  give  the  reader  an  idea  of  what  to  expect  in  the  way  of 
the  "relative",  rather  than  "absolute",  performance  of  a  DAME-like 
system  in  the  various  monitoring  and  analysis  tasks  on  which 
measurements  are  presented.  Clearly,  the  speed  of  any  component 
can  be  increased  by  better  coding  or  less  generality  or  by  the 
use  of  some  of  the  ideas  presented  in  the  final  section  of  Chapter  3. 

5.4.1  Performance  of  the  PDP-11  Simulator 

The  most  basic  observation  is  that  simulation  at  memory 
cycle  level  via  a  general-purpose  scheduling  mechanism  degrades 
the  performance  by  at  least  a  factor  of  3  over  emulation,  in 
which  no  scheduling  is  made.  In  the  DAME  system,  simulation  runs 
about  3000  times  slower  and  emulation  1000  times  slower  than  a 
PDP-11/20.  These  factors  include  about  a  25%  overhead  for 
checking  for  hooks.  These  figures  are  based  on  measurements  of 
the  time  charged  to  the  user  by  the  PDP-10  monitor,  which  includes 
supervisory  and  swapping  overhead  etc.  and  have  shown  a  deviation 
of  up  to  15%  in  both  directions. 

5.4.2  Node  Entry/Exit  Overhead 

If  input/output  sets  are  not  being  constructed,  the  overhead 
for user-defined nodes amounts to 3.2 milliseconds per node
instance  for  entry  and  exit  combined  and  1.2  milliseconds  per 
node  instance  to  create  a  node  trace  entry,  for  a  total  of 
4.4  milliseconds  per  node  instance.  In  the  DAME  system,  these 
costs  have  been  found  to  be  only  associated  with  the  actual  entry 
and  exit  events;  the  cost  of  checking  for  entry  and  exit  with 
each  instruction  is  found  to  be  less  than  the  precision  of  the 
measurements.




5.4.3 Input/Output Set Overhead

When  I/O  sets  are  being  used,  there  is  an  added  overhead 
at  node  entry  and  exit,  of  about  40  milliseconds  each,  for 
creating  and  closing  the  I/O  sets.  In  addition,  the  overhead 
of  testing  each  generated  address  to  see  if  it  should  be  added 
to the input or the output set amounts to about 1.3 milliseconds
per fetched or stored operand, or about 6 milliseconds per OM
instruction in the node instance. Thus, for a node instance of
5 instructions, the total overhead for I/O set creation and
maintenance would be (2*40)+(6*5)=110 milliseconds. If we assume
that  40  percent  of  all  the  executed  instructions  belong  to  some 
node  and  an  average  of  5  instructions  per  node  instance,  the 
total  overhead  for  nodes  and  I/O  sets  would  be  11.5  ms  per  executed 
instruction. 


For  a  PDP-11  simulator  with  a  slow-up  factor  of  3000,  assuming 
an  average  of  3.5  microseconds  of  real-time  per  instruction, 
this  amounts  to  an  additional  delay  factor  of  1.8. 
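
Two of the figures in this subsection follow directly from the quoted measurements and can be reproduced with a short Python sketch. The function names and default arguments are assumptions for illustration; the constants (40 ms per set open or close, 6 ms per instruction, a slow-up factor of 3000 and 3.5 microseconds per real instruction) are the ones stated above.

```python
def io_set_overhead_ms(instructions, open_close_ms=40.0, per_instruction_ms=6.0):
    """I/O set overhead of one node instance, in milliseconds.

    One creation at entry and one closing at exit, plus the
    per-instruction cost of testing generated addresses.
    """
    return 2 * open_close_ms + per_instruction_ms * instructions


def simulated_instruction_ms(slowdown=3000, real_us=3.5):
    """Host time per simulated instruction, in milliseconds."""
    return slowdown * real_us / 1000.0


per_instance = io_set_overhead_ms(5)
per_simulated_instruction = simulated_instruction_ms()
```

The second quantity is the baseline against which any additional per-instruction overhead has to be compared when it is expressed as a delay factor.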


CHAPTER  6 


HIGH-LEVEL  LANGUAGES  FOR  EXECUTION  ANALYSIS 


One of the major shortcomings of the DAME System as described
in Chapter 3 is that its language is too primitive for making
arithmetic calculations and certain types of monitoring
operations. This fact was not altogether unexpected. One reason for
choosing this level in the design was the desire to avoid
interpreting by software a complex syntax at run-time. A second reason
was the anticipation of the possibility that any proposed
hardware or microcoded implementation of a DAME-like facility might
employ an instruction set very similar to this one. Hence an
effort was made to keep a major part of the instruction set simple
enough to be implemented by hardware or, more probably, by
microcode. However, certain instructions are still too complex and
would probably be best implemented by software (e.g. Playback
Values, Replay Node Instance, Type Object instructions).

In  this  chapter,  I  would  like  to  discuss  some  issues  in  the 
design of high-level languages for execution monitoring and
analysis.  The  emphasis  will  be  on  features  which  are  particularly 
relevant  to  this  application  area. 

The general structure of this chapter is as follows: first,
a number of issues related to the human engineering aspects of
interactive systems and languages are discussed as they apply
to our problem. In particular, trade-offs between simplicity
and power and between terseness and "rememberability" (ease of
use) are outlined.

Second,  the  major  data  elements  with  which  a  high-level 
execution  analysis  language  must  deal  and  the  appropriate  forms 
of  access  to  each  of  these  data  elements  are  taken  up. 

Finally, the problem of "continuously-evaluated expressions"
is discussed. In particular, appropriate control structures
for the continuous evaluation of a set of predicates and
techniques for efficient implementation, as discussed by D. Fisher in
his thesis [Fi 70], are presented and evaluated.




6.1  Some  Human  Engineering  Issues 

Since  most  of  the  programming  in  the  analysis  level  will  be 
done  by  the  analyst  at  the  terminal,  almost  in  real-time,  without 
laboring  over  a  page  of  analysis  code  for  several  hours,  certain 
properties  of  the  total  interactive  system  become  very  crucial. 

The  issues  I  would  like  to  discuss  here  are  those  related  to 
this  aspect  of  the  design  of  the  language  of  the  analysis  facility. 

Due  to  the  hands-on,  "quasi-real-time"  nature  of  the  analysis 
programming  process,  it  is  clear  that  the  language  must  be  terse 
and  conducive  to  error-free  programming.  An  error  in  an  analysis 
procedure  can  be  "doubly  costly"  in  the  sense  that  it  not  only 
causes a wrong computation but also unnecessary periods of
(possibly simulated) execution by the object machine which it
controls. The control structure of the analysis language is an
especially important factor, because of possible interaction
between the control flow in the analysis program and the object
program. Another complicating factor is that several analysis
actions may have been independently scheduled to be activated
at the same contact point. Hence, whenever these actions are
sensitive to the order in which they are executed, the user must
have explicit knowledge of that order and must be able to modify
it.  In  a  list-oriented  system  such  as  DAME,  this  is  extremely 
easy.  Here,  the  flexibility  of  a  loosely  structured  list  must 
be  weighed  against  the  execution  efficiency  of  a  more  optimized, 
tighter  representation. 

We  have  already  noted  the  need  for  terseness,  simplicity  of 
syntax  and  conduciveness  to  error-free  programming.  These 
objectives  can  conflict  with  each  other  when  any  one  of  them  is 
pursued  with  excessive  zeal.  For  example,  the  goal  of  terseness 
can  lead  to  a  design  where  the  user  has  to  remember  a  large 
number  of  special  symbols  as  operators  or  control  characters. 
Over-emphasis  on  simplicity  of  syntax  can  lead  to  either  weakening 
of  the  power  of  the  language  (as  one  goes  in  the  general  direction 
of  the  Turing  tar  pit)  or  to  the  definition  of  many  special 
symbols which have to be remembered. An interesting case is
presented by the syntax of LISP. It neither requires memorizing
a large number of special symbols, nor can the language be said
to be too primitive. Its failing however, as users of LISP will
painfully testify, is the extreme reliance on balanced and properly
matched sequences of parentheses, which is one of the most
frequent sources of simple errors in LISP programming. Another
virtue of LISP is the fact that both program text and data use the
same basic representation: namely, list structures. This feature
facilitates operations on programs as data, e.g. to parse them,
generate them or delete them. These operations are more difficult
in languages where the syntactic elements of the language can not
be represented in one of the dominant data types or data
structures defined in the language (it must be noted that most primitive
machine languages do satisfy this requirement). For these reasons,
a list-oriented syntax was selected for DAME. While there is
much room for improvement in it, the chosen syntax has proved
remarkably flexible and resilient under demands to accommodate more
and more complex instruction forms. A good example of this is
the Search List (SLIST) instruction. (See Section 3.6.1 or
Appendix A: Introduction to DAME for a description of this
instruction).


I would now like to consider the special-purpose data
structures with which high-level execution analysis languages must
deal (i.e. structures unique to execution analysis) and the access
methods which they must provide.

6.2  High-Level  Data  Access  in  Execution  Analysis 

The set of major data elements with which an execution
analysis facility must deal was discussed in Chapter 2, and we
summarize those elements here:

(i)  The  external  state  of  the  Object  Machine 
(i.e.  main  memory  and  user-addressable  registers), 

(ii)  Some  parts  of  the  current  internal  state  of  the  OM, 

(iii)  Possibly,  user  program  text  and  symbol  table, 

(iv) Structural information about the user program
(e.g. its nodes),

(v) Empirical data associated with each component of the
structure (e.g. I/O sets of node instances, data created by user
at run-time),

(vi) Execution history,

(vii) Analysis program text,

(viii) Representation of the association between analysis
actions and contact points,

(ix) Entities holding intermediate results of analysis
computations.


I  shall  now  discuss  appropriate  forms  of  high-level  access 
to  each  of  these  elements. 


(i) The external state elements should be accessible by
explicit addressing, e.g. core[2000], by computed addresses,
e.g. core[A+B], through Object Machine pointers (e.g.
core[A+core[B]]), as well as in blocks (e.g. core[A:B] <- 0, where
A:B denotes 'A to B', or core[100:200] <- core[300:400]). User-
addressable registers should be accessible by their mnemonic
names used in the assembly language as well as by their memory
addresses where such addresses exist.
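
The access forms listed in (i) map naturally onto indexing and slicing in a modern language. The Python sketch below is an illustration only: the class name Core is hypothetical, memory is a plain list of integers, and, following the text, A:B is taken to mean 'A to B' inclusive (unlike ordinary Python slices, which exclude the upper bound). Both bounds of a block must be given.

```python
class Core:
    """Object machine main memory with block access (A:B is inclusive)."""

    def __init__(self, size):
        self.words = [0] * size

    @staticmethod
    def _addresses(block):
        return range(block.start, block.stop + 1)

    def __getitem__(self, key):
        if isinstance(key, slice):
            return [self.words[a] for a in self._addresses(key)]
        return self.words[key]

    def __setitem__(self, key, value):
        if not isinstance(key, slice):
            self.words[key] = value
            return
        addresses = self._addresses(key)
        # a single value fills the block; a sequence is copied word by word
        values = [value] * len(addresses) if isinstance(value, int) else list(value)
        if len(values) != len(addresses):
            raise ValueError("block sizes differ")
        for address, word in zip(addresses, values):
            self.words[address] = word


core = Core(512)
A, B = 5, 6
core[300:400] = range(101)
core[100:200] = core[300:400]     # block-to-block copy
core[10:20] = 7                   # fill a block with one value
core[B] = 4
core[A + core[B]] = 99            # access through an object machine pointer
```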

(ii)  Those  elements  of  the  internal  state  of  the  object 
machine  which  contain  the  various  fields  of  the  current  instruction 
(e.g. opcode, source operand, destination operand) should be
accessible  by  suitable  mnemonics. 

(iii) Access to user program text makes possible such things
as building a text editor/incremental assembler into the analysis
facility so that corrections to user programs may be made as
they are discovered, rather than saved until the end and made in
a separate operation. The availability of the user symbol table
clearly facilitates communication between the user and the
analysis facility by permitting the use of the symbols appearing
in the user program. One or both of these facilities are
available in several systems though not in DAME (e.g. see Lampson [La 65],
Evans and Darley [ED 65]. For a comparative discussion of various
techniques related to this topic, see Evans and Darley [ED 66].)

(iv) Structural information about the user program describes
the components of that program, as they have been defined by
the user or determined by the system for the purposes of analysis,
and the relationships between the components (e.g. predecessor/
successor, outer/inner node relations). This can be in tabular
form or in the form of "descriptor objects" (as in DAME) which
can be manipulated by list-processing functions. In any case,
it should be possible to reference these descriptions explicitly
(e.g. "node A" or "the node starting at location 10000"), by a
computed address (e.g. "the node starting at the location pointed
to by contents of location 10000 + contents of register 3"), or as
elements of a list or table satisfying a predicate (e.g. "all
nodes between 10000 and 12000"). Better yet, the user can be
given a facility for stepping through the component descriptions
in a systematic way and computing arbitrary functions using the
various fields within each description, with the ability to exit
the search at any point or have it terminated automatically when
the end of the table or list is reached.
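
Both forms of access described in (iv), selection by predicate and stepping with early exit, can be sketched in a few lines of Python. The Node record and the function names are hypothetical and stand in for whatever descriptor objects an analysis facility provides.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node description: a name and a starting address."""
    name: str
    start: int


def select(nodes, predicate):
    """Return the descriptions satisfying a user predicate."""
    return [node for node in nodes if predicate(node)]


def step_through(nodes, visit):
    """Hand each description to `visit` in turn.

    The user's function returns True to exit the search; otherwise the
    search ends automatically at the end of the list. The number of
    descriptions visited is returned.
    """
    visited = 0
    for node in nodes:
        visited += 1
        if visit(node):
            break
    return visited


nodes = [Node("A", 9000), Node("B", 10000), Node("C", 11000), Node("D", 13000)]
in_range = select(nodes, lambda node: 10000 <= node.start <= 12000)
steps = step_through(nodes, lambda node: node.name == "C")
```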

(v) Empirical data generated during the execution of the
user program should be linked with the phase of the execution to
which they relate and they should be accessible by the user through
that link. An example of such data is the input/output sets
of a node instance in DAME. These sets are accessible directly
via pointers contained in the entry for the associated node
instance in the node trace table, as well as through the
chronological list of pointers to a node's I/O sets, pointed to by the
node object itself. This process of linking empirical data
with the associated portion of the execution history can be
done by the Analysis Facility for specific types of data which
the system knows about (e.g. I/O sets); but the user should
also have a way of doing the same thing for arbitrary data.
An example of the latter case is where the user wants to attach
to each node a list of the addresses of every unique successor
of that node in an analysis of control flow. This requires, for
example, that whenever a new node is entered, the user be able
to locate and search the current members of the successor list
of the last node for the address of the new one, and if it is
not found, be able to add it there. Such a mechanism may be
implemented in terms of a more general associative search facility,
such as LEAP in SAIL (See Feldman and Rovner [FR 69]). This
associative search facility would permit LEAP-like statements
such as:

    FOREACH X,Y,Z SUCH THAT <condition> AND
        <condition> ... DO <statement>;

where X, Y and Z may be nodes, node instances, I/O sets or any
of the other defined object types in the system. (Note that
one would probably prefer terse, single-character symbols for
the FOREACH, SUCH THAT and AND in an interactive language).
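
The successor-list example and a FOREACH-style query can be approximated in Python as below. This is a rough analogue only, not LEAP: foreach enumerates the Cartesian product of the given domains and filters it with the conditions, which is far less efficient than a true associative store. All names are hypothetical, and nodes are identified by their starting addresses.

```python
from itertools import product


def record_successor(successors, last_node, new_node):
    """On entry to new_node, add it to last_node's successor list if absent."""
    members = successors.setdefault(last_node, [])
    if new_node not in members:
        members.append(new_node)


def foreach(domains, *conditions):
    """Yield every combination of items that satisfies all conditions."""
    for combination in product(*domains):
        if all(condition(*combination) for condition in conditions):
            yield combination


# Build successor lists from a trace of node entries.
trace = [100, 200, 100, 300, 100, 200]
successors = {}
for last_node, new_node in zip(trace, trace[1:]):
    record_successor(successors, last_node, new_node)

# "FOREACH X,Y SUCH THAT Y is a successor of X AND X < Y"
nodes = sorted(successors)
forward_edges = list(foreach([nodes, nodes],
                             lambda x, y: y in successors[x],
                             lambda x, y: x < y))
```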

(vi) The execution history information represents essentially
a variable-level trace, where the precise level depends on the
structure which has been defined over the user program. Thus,
the time grain and the volume of collected data are under user
control. In addition to the normal maintenance by the Analysis
Facility over the course of the execution, this data will also
be accessed by user analysis routines. The form of the access
should be similar to the preceding case; e.g. applying a function
f to a sequence of elements in the execution history which satisfy
a user predicate p. The function f and the predicate p may
involve both the history information itself (e.g. the addresses
of the k nodes executed prior to the last execution of node A)
or the empirical data dynamically associated with it, as described
in (v). For this purpose, the analysis language should have a
facility for searching over the execution history events, backward
or forward in time, and applying predicates to each event
encountered.
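
A history search of this kind can be sketched in Python. The sketch assumes that the execution history is a simple list of node names in order of execution; the function names are hypothetical. The second function implements the example given in the text, the k nodes executed prior to the last execution of a given node.

```python
def search_history(history, predicate, start=None, backward=True):
    """Yield (index, event) for each event satisfying `predicate`.

    The search runs backward in time from `start` (default: the most
    recent event) or forward from `start` (default: the first event).
    """
    if start is None:
        start = len(history) - 1 if backward else 0
    indices = range(start, -1, -1) if backward else range(start, len(history))
    for i in indices:
        if predicate(history[i]):
            yield i, history[i]


def nodes_before_last(history, node, k):
    """Return the (up to) k events preceding the last execution of `node`."""
    matches = search_history(history, lambda event: event == node)
    last = next((i for i, _ in matches), None)
    if last is None:
        return []
    return history[max(0, last - k):last]


history = ["B", "A", "C", "D", "A", "E"]
before = nodes_before_last(history, "A", 2)
forward_hits = list(search_history(history, lambda e: e == "A", backward=False))
```

Because search_history is a generator, the caller can stop consuming it at any point, which gives the early-termination behaviour without a separate exit mechanism.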


In the last two paragraphs, two distinct ways of performing
searches over execution history and associated empirical data
have been proposed. It is beneficial to recap them at this point:
one way is to build into the language high-level associative
search facilities such as those of LEAP, and the other is to give
the user lower-level mechanisms such as the Search List instruction
in DAME, which systematically give the user the next element of
the list being searched and test to see if the user wishes to
terminate the search or not. If the first facility is provided,
then clearly the language must possess a fairly sophisticated
list-search mechanism. In such a case, the system might as well
give the user the second, lower-level ability too, since this
would be at almost no additional cost to the system and there
will probably be a number of cases where this lower-level ability
will be much more useful or efficient for the user.
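The relationship between the two facilities can be sketched as follows, under the assumption that the high-level search is layered on the low-level one. Both `search_list` and `select` are hypothetical names; the sketch does not reproduce the actual Search List instruction of DAME or the associative operators of LEAP.

```python
def search_list(items, visit):
    """Low-level search: hand each element to `visit`, which returns True to stop.

    Returns the index at which the search was terminated, or None if the
    whole list was examined.
    """
    for index, item in enumerate(items):
        if visit(item):
            return index
    return None


def select(items, predicate):
    """High-level search built on the low-level primitive: collect all matches."""
    matches = []

    def visit(item):
        if predicate(item):
            matches.append(item)
        return False  # never terminate early: every element is examined

    search_list(items, visit)
    return matches


first_large = search_list([3, 8, 1, 9], lambda x: x > 5)  # stops at the first match
all_large = select([3, 8, 1, 9], lambda x: x > 5)
```

In this arrangement the low-level form lets the user stop at the first element of interest, while the high-level form always scans the whole list.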

(vii)  The  analysis  program  text  and  possibly  its  internal 
representation  will  be  of  interest  to  the  user  in  such  cases  as 
when  he  wants  to  see  the  texts  of  the  actions  associated  with  a 
particular  type  of  access  to  a  location  or  to  edit  or  patch  an 
existing  analysis  routine.  Thus,  it  is  important  that  the  analysis 
facility  contain  an  on-line  editor  for  analysis  text  which  can 
also  be  invoked  under  program  control. 

(viii)  In  addition  to  accessing  the  text  of  analysis  routines, 
the  user  should  be  able  to  access  a  list  of  the  names  of  analysis 
actions  associated  with  a  particular  address  or  contact  point. 

This  is  important,  for  example,  in  avoiding  duplicate  entries 
for  the  same  analysis  routine  or  in  determining  in  what  order  the 
actions  associated  with  an  address  or  contact  point  should  be 
arranged,  e.g.  to  optimize  the  set  of  analysis  actions. 

Clearly,  if  the  syntax  of  the  analysis  language  and  the  form 
of  these  associations  fall  into  one  of  the  dominant  data  types 
handled  by  the  analysis  language,  very  little  additional  machinery 
will be necessary to give the user the abilities mentioned above.

(ix)  In  the  course  of  analysis  computations,  the  user  will 
often  want  to  hold  temporary  results  in  local  (or  transient) 
variables.  Depending  on  the  kinds  of  entities  manipulated  in  the 
computation (e.g. lists, arrays, strings), the user will need
to  create,  and  later  delete,  entities  of  appropriate  type  for 
this  purpose.  In  a  highly  modularized  style  of  programming,  such 
as  we  expect  analysis  programming  to  be,  it  is  very  desirable 
to  have  local  variables,  if  for  no  other  reason  than  the  very 
practical  one  that  whenever  one  defines  a  new  variable,  one 
would  like  to  be  sure  that  one  is  not  clobbering  an  already  existing 
variable  with  the  same  name,  which  may  have  been  defined  by  any 
one  of  the  number  of  routines  used  in  the  computation.  Thus, 
through  the  use  of  local  variables,  painful  searches  of  all  the 
used  analysis  routines  for  each  new  identifier  to  be  created  can 
be  eliminated. 




6.3 Continuous Evaluation of Expressions

One of the main functions of a high-level execution analysis
language should be to facilitate the description of execution
events which the user wishes to watch for. These events can
generally be expressed as a change in the value of a predicate
from FALSE to TRUE. Such a predicate can involve arbitrary
functions over the data elements discussed in the previous section.
The continuous monitoring of predicates was discussed by D. Fisher
in his thesis [Fi 70]. In this and the next section, I discuss
the basic procedure for implementing continuously evaluated
expressions as described by Fisher, as well as some points not directly
addressed by him.

I shall start by discussing the overall control flow in the
continuous evaluation of a set of predicates, deferring the
discussion of efficient techniques for the continuous evaluation
of individual predicates until the next section.

Normally, when one of these events takes place (i.e. the
value of one of the monitored predicates becomes TRUE), some
action is taken. Then the question arises: "Should the same
predicate be now re-evaluated, because the action may have changed
its value again?" More generally, the question is: "What should
be the control structure for the continuous evaluation of a set
of predicates?" I shall denote by S: <predicate> -> <action> the
specification S that <action> has to be executed when <predicate>
becomes TRUE. Consider for example the following specification:

A: (b > 0) -> (b <- b+1);

where  b  is  an  analysis  system  (not  object  machine)  variable. 

This specification will cause no changes in the state of the
analysis system until b exceeds zero. But after the first time
the predicate is found to be TRUE, what happens to the system
depends on whether or not the predicate is evaluated again
immediately following the action b <- b+1. If it is, clearly the
system will fall into an infinite loop (infinite for all practical
purposes, unless this is avoided in some special cases by a quirk
in the number representation in the system, e.g. adding one to
the largest possible positive integer resulting in zero, or some
similar event). This is clearly a very undesirable situation.

If the predicate is not re-evaluated, the infinite loop does
not result. However, in order to accommodate the case where the
user may wish to continue the "predicate evaluation-action" loop
until the predicate returns FALSE, a WHILE <predicate> DO (or
an equivalent construct) should be available.
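The difference between the two policies can be sketched as follows. This is a simplified illustration, not Fisher's procedure: the `Monitor` class, the `while_do` function and its `limit` parameter are assumptions of the sketch, and the monitor here fires whenever its predicate is TRUE at a change rather than tracking the FALSE-to-TRUE transition.

```python
def exceeds_zero(state):
    return state["b"] > 0


def increment(state):
    state["b"] += 1


class Monitor:
    """Specification S: <predicate> -> <action>, with no re-evaluation after the action."""

    def __init__(self, predicate, action):
        self.predicate = predicate
        self.action = action

    def on_change(self, state):
        """Called once per state change; the action runs at most once per call."""
        if self.predicate(state):
            self.action(state)
            return True
        return False


def while_do(predicate, action, state, limit=1000):
    """Explicit WHILE <predicate> DO <action>; `limit` only guards this sketch."""
    steps = 0
    while predicate(state) and steps < limit:
        action(state)
        steps += 1
    return steps


state = {"b": 0}
rule_a = Monitor(exceeds_zero, increment)
rule_a.on_change(state)   # predicate is FALSE, nothing happens
state["b"] = 1            # b now exceeds zero
rule_a.on_change(state)   # action runs exactly once, so b becomes 2

# Re-evaluating after every action never terminates for this predicate;
# here the loop is cut off only by the safety limit.
runaway_steps = while_do(exceeds_zero, increment, {"b": 1}, limit=10)
```

With the first policy the user who does want repetition asks for it explicitly through `while_do`, and is then responsible for a predicate that eventually becomes FALSE.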




We  still  have  not  answered  our  general  question  regarding 
the  control  structure  of  the  predicate  evaluation  mechanism 
in  full.  We  have  determined  that  changes  to  analysis  system 
variables  should  not  cause  a  re-evaluation  of  the  predicates. 

How  about  changes  to  the  object  machine  state  by  analysis  actions 
--  should  these  changes  cause  a  re-evaluation  just  as  if  the 
change  were  caused  by  the  execution  of  the  object  program  itself? 

If  so,  we  can  again  have  the  same  infinite  loop  problem.  On  the 
other hand, by ruling this out, we are ruling out an important
class  of  functions  which  the  analysis  facility  should  be  able 
to  perform:  namely,  faithful  mimicking,  by  analysis  code,  of 
the  effects  of  a  piece  of  object  code.  For  example,  a  predicate 
which  tests  the  contents  of  object  machine  location  X  should  be 
able  to  be  activated  whether  the  contents  of  X  is  changed  by  the 
user  program  or  a  model  (in  the  language  of  the  analysis  facility) 
of  that  program.  Thus,  this  seems  to  be  a  desirable  ability  to 
have.  However,  there  are  some  other  points  also  to  be  considered: 
How  do  we  handle  changes  to  several  variables?  How  do  we  handle 
multiple  changes  to  the  same  variable  in  the  action  associated 
with  a  single  predicate? 

Considering the multiplicity of requirements that a high-level
rule for this purpose would have to satisfy, the best policy
seems to be to let the user decide what he wants to do, i.e.
give him the ability to test if there are any predicates involving
a particular object machine (OM) variable and, if so, to evaluate
those predicates and take any associated actions whenever he
chooses to do so. It may be desirable to have a high-level
operator to do all of this for a given symbol; e.g. an operator
CHECK(X) may serve this purpose, as in:

[1]: (A > 5) -> (B <- A; CHECK(B); A <- C; C <- D; CHECK(C));

where we have applied CHECK to B and C but not to A.
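A minimal sketch of how such a CHECK operator might behave is given below. The `Analyzer` class and its method names are hypothetical, and rules [2] and [3] are invented here only so that the calls to CHECK have something to find.

```python
class Analyzer:
    """Holds object machine (OM) variables and the rules that mention them."""

    def __init__(self, om):
        self.om = dict(om)
        self.rules = []   # (label, variables mentioned, predicate, action)
        self.fired = []   # labels of rules whose actions were taken, in order

    def add_rule(self, label, variables, predicate, action):
        self.rules.append((label, set(variables), predicate, action))

    def check(self, name):
        """CHECK(name): evaluate the predicates mentioning `name`; act on TRUE ones."""
        for label, variables, predicate, action in self.rules:
            if name in variables and predicate(self.om):
                self.fired.append(label)
                action(self)


def rule1_action(a):
    # (B <- A; CHECK(B); A <- C; C <- D; CHECK(C))
    a.om["B"] = a.om["A"]
    a.check("B")
    a.om["A"] = a.om["C"]   # no CHECK(A): predicates on A are not re-evaluated
    a.om["C"] = a.om["D"]
    a.check("C")


analyzer = Analyzer({"A": 7, "B": 0, "C": 1, "D": 0})
analyzer.add_rule("1", ["A"], lambda om: om["A"] > 5, rule1_action)
analyzer.add_rule("2", ["B"], lambda om: om["B"] > 5, lambda a: None)   # assumed rule
analyzer.add_rule("3", ["C"], lambda om: om["C"] == 0, lambda a: None)  # assumed rule

analyzer.check("A")   # models a change to A made by the object program
```

Because the action of rule [1] never applies CHECK to A, the assignment to A inside the action cannot re-trigger rule [1], which is how the user avoids the infinite loop in this arrangement.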

One possible model for this control structure appears to be
that  of  Markov  Algorithms.  In  this  model,  predicate  evaluation 
is  halted  after  the  first  predicate  with  a  value  of  TRUE  has 
been  found  and  the  associated  action  has  been  taken.  Following 
the  next  change  in  the  object  machine  state,  predicate  evaluation 
starts  again  from  the  top,  i.e.  with  the  first  predicate. 

A second possible model is one in which every predicate is
evaluated with every change in object machine state and the actions
associated with every predicate whose value is TRUE are executed,
in static order.
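The two models can be contrasted in a short sketch. The function names are hypothetical, and one detail is an assumption of the sketch: in the second model all predicates are evaluated against the state as it stands after the change, before any action is run.

```python
def markov_step(rules, state):
    """First model: stop after the first TRUE predicate and its action."""
    for label, predicate, action in rules:
        if predicate(state):
            action(state)
            return [label]
    return []


def evaluate_all(rules, state):
    """Second model: evaluate every predicate, then run all TRUE actions in static order."""
    true_rules = [(label, action) for label, predicate, action in rules
                  if predicate(state)]
    for _, action in true_rules:
        action(state)
    return [label for label, _ in true_rules]


log = []
rules = [
    ("1", lambda s: s["x"] > 0, lambda s: log.append("1")),
    ("2", lambda s: s["x"] > 1, lambda s: log.append("2")),
    ("3", lambda s: s["x"] > 5, lambda s: log.append("3")),
]

first_only = markov_step(rules, {"x": 2})   # only rule 1 acts
all_true = evaluate_all(rules, {"x": 2})    # rules 1 and 2 act
```

Each function would be called once per change in the object machine state; the list of rules plays the role of the static order.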

Either model is feasible for this purpose. However, it is
clear that the evaluation of even one (arbitrarily complex)
predicate after every change in the object machine state can be
unbearably expensive, unless some very efficient techniques are
found to perform the evaluation. In fact, unless such techniques
are found, the ideas discussed so far in this section just could
not be implemented in a useful way. Thus, although these techniques
may seem like "implementation details", they are of the essence,
since without them these ideas could not be put to use.


6.4 Implementation of Continuously Evaluated Expressions


The first technique which comes to mind in this connection
is to monitor the accesses to each variable appearing
in a predicate and, whenever such a variable changes value, to
re-evaluate the predicates in which it appears, in static
order. Thus, in the case of the following set of
(<predicate> -> <action>) pairs:

[1]: ((X>5)&((Y+Z)<0)) -> f1;

[2]: ((X<5)&(W<Y)) -> f2;

both predicates will be re-evaluated with each modification of
X or Y, predicate [1] with each modification of Z, and predicate
[2] with each modification of W.
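The bookkeeping behind this technique can be sketched as an index from each variable to the predicates which mention it. The `PredicateMonitor` class is hypothetical, and the comparison of X with 5 in predicate [1] is an assumption, since that operator is hard to read in the source.

```python
from collections import defaultdict


class PredicateMonitor:
    """Re-evaluates only the predicates which mention the variable just stored into."""

    def __init__(self):
        self.rules = []                       # kept in static order
        self.by_variable = defaultdict(list)  # variable name -> rule indices
        self.evaluations = 0

    def add(self, label, variables, predicate, action):
        index = len(self.rules)
        self.rules.append((label, predicate, action))
        for name in variables:
            self.by_variable[name].append(index)

    def store(self, state, name, value):
        """Model a store into an object machine variable; return labels of rules fired."""
        if state.get(name) == value:
            return []                         # no change in value, nothing to do
        state[name] = value
        fired = []
        for index in self.by_variable.get(name, []):
            label, predicate, action = self.rules[index]
            self.evaluations += 1
            if predicate(state):
                action(state)
                fired.append(label)
        return fired


monitor = PredicateMonitor()
monitor.add("1", ["X", "Y", "Z"],
            lambda s: s["X"] > 5 and (s["Y"] + s["Z"]) < 0, lambda s: None)
monitor.add("2", ["X", "W", "Y"],
            lambda s: s["X"] < 5 and s["W"] < s["Y"], lambda s: None)

state = {"X": 0, "Y": 0, "Z": 0, "W": 0}
results = [
    monitor.store(state, "Z", -1),  # only predicate 1 is evaluated
    monitor.store(state, "X", 6),   # both predicates are evaluated
    monitor.store(state, "W", -3),  # only predicate 2 is evaluated
    monitor.store(state, "X", 2),   # both predicates are evaluated
]
```

Over these four stores the monitor performs six predicate evaluations, against eight if both predicates were re-evaluated after every change.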

This implementation can be further optimized by observing
that whenever the value of a variable V appearing in a predicate
P is changed, we do not necessarily have to re-evaluate all of
P. First we can evaluate those terms in P in which V appears;
and only if one of those terms changes value do we need to make
any more evaluations. This suggests a tree representation of
the predicate and a generalization of the above evaluation rule
to one where a node in level L in the tree is re-evaluated if
and only if any of its immediate descendants (in level L+1) is
modified.

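Fisher [Fi 70] deals with the condition where the same term occurs
more than once in a subtree. Consider for example the expression

((X+Y)/(X+Z))*(X+Y+Z)

which can be represented by the following precedence tree (listed
here by level, with the variables X, Y and Z at level 0):

level 1:  d1 = X+Y     d2 = X+Z     d4 = X+Y
level 2:  d3 = d1/d2   d5 = d4+Z
level 3:  d6 = d3*d5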


Here, the d's (d1 through d6) are dummy variables introduced to
hold the current value of each subexpression. Let us suppose that
the value of X has just changed and that we wish to propagate that
change through the whole expression. If we follow the rule given
above in a depth-first, left-to-right fashion, we proceed as follows:

At level 0, we substitute the new value of X into the leftmost
instance of X. We evaluate d1 using the old value of Y. If
d1 changes, we evaluate d3 using the old value of d2. If d3
changes, we evaluate d6 using the old value of d5. Then, going
back to level 0, we substitute the new value of X into the
expression for d2 and evaluate it using the old value of Z; if
the value of d2 changes, we evaluate d3 again. If d3 changes,
we evaluate d6 again using the old value of d5. Then, after
similarly proceeding up the right subtree with the new value of
X, we evaluate d6 a third time.


Thus, in this expression, we have evaluated one subexpression,
d3, twice and another, d6, three times. In general, if a strict
depth-first search is followed, then for each new value assigned
to a variable X, every node e in the evaluation tree will be
evaluated n_e(X) times, where n_e(X) = number of occurrences of X
in node e.
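The evaluation counts quoted above can be checked with a small sketch of the depth-first procedure. The tree encoding below (d1 = X+Y, d2 = X+Z, d3 = d1/d2, d4 = X+Y, d5 = d4+Z, d6 = d3*d5) is inferred from the walk-through in the text, and the function names are hypothetical.

```python
import operator

# node name -> (operator, left operand, right operand, parent)
TREE = {
    "d1": (operator.add, "X", "Y", "d3"),
    "d2": (operator.add, "X", "Z", "d3"),
    "d3": (operator.truediv, "d1", "d2", "d6"),
    "d4": (operator.add, "X", "Y", "d5"),
    "d5": (operator.add, "d4", "Z", "d6"),
    "d6": (operator.mul, "d3", "d5", None),
}


def initial_values(x, y, z):
    values = {"X": x, "Y": y, "Z": z}
    for name, (op, left, right, _) in TREE.items():  # operands precede their users
        values[name] = op(values[left], values[right])
    return values


def depth_first_update(values, variable, new_value):
    """Propagate a change upward separately from each occurrence of `variable`."""
    counts = {name: 0 for name in TREE}
    values[variable] = new_value
    for name, (_, left, right, _) in TREE.items():   # occurrences, left to right
        if variable not in (left, right):
            continue
        node = name
        while node is not None:
            op, l, r, parent = TREE[node]
            old, values[node] = values[node], op(values[l], values[r])
            counts[node] += 1
            if values[node] == old:
                break                                # no change: stop climbing
            node = parent
    return counts


values = initial_values(1, 2, 3)
counts = depth_first_update(values, "X", 2)
print(counts)  # {'d1': 1, 'd2': 1, 'd3': 2, 'd4': 1, 'd5': 1, 'd6': 3}
```

With X changing from 1 to 2 (Y = 2, Z = 3), every intermediate value changes, so d3 is evaluated twice and d6 three times, in agreement with the count n_e(X).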

Clearly, the reason for the unnecessary computations is
the fact that at each node we do not wait for the entire sub-tree
under the node to be evaluated. Thus, a breadth-first search
for terms to be evaluated, where whenever the value of a term
changes as a result of re-evaluation, a flag bit associated with
its immediate ancestor is turned on and its own flag bit is
turned off, would eliminate the unnecessary evaluations. Thus,
in the preceding example, initially all flags would be turned
off. Then, when the value of X changed, the flag bits of the
X's in level 0 would be turned on. The new value of X would be
plugged into each of its occurrences in level 0, also turning on
the flags of d1, d2 and d4 (since each is an immediate ancestor
of X in level 0). The evaluation would then move to level 1.
If the value of either d1 or d2 changed, the flag of d3 would
be turned on. Similarly, if the value of d4 changed, the flag
of d5 would be turned on. The evaluation would stop either after
the root node has been evaluated or when no more flag bits which
have been turned on can be found.

This breadth-first search (described by Fisher) thus avoids
the unnecessary computations of the earlier depth-first procedure
by using an additional bit of information associated with each
node of the tree to guide itself to those nodes whose values
could possibly change due to the change in one of their descendants.
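A sketch of the flag-bit procedure, for the same example expression, is given below. It is an illustration under assumptions rather than Fisher's implementation: the tree encoding is inferred from the text, the names are hypothetical, and levels are computed as the height of each node above the variables.

```python
import operator

# node name -> (operator, left operand, right operand, parent)
TREE = {
    "d1": (operator.add, "X", "Y", "d3"),
    "d2": (operator.add, "X", "Z", "d3"),
    "d3": (operator.truediv, "d1", "d2", "d6"),
    "d4": (operator.add, "X", "Y", "d5"),
    "d5": (operator.add, "d4", "Z", "d6"),
    "d6": (operator.mul, "d3", "d5", None),
}


def level(name):
    """Variables are at level 0; a node is one level above its highest operand."""
    if name not in TREE:
        return 0
    _, left, right, _ = TREE[name]
    return 1 + max(level(left), level(right))


def initial_values(x, y, z):
    values = {"X": x, "Y": y, "Z": z}
    for name in sorted(TREE, key=level):
        op, left, right, _ = TREE[name]
        values[name] = op(values[left], values[right])
    return values


def breadth_first_update(values, variable, new_value):
    """Evaluate flagged nodes level by level; each node is evaluated at most once."""
    counts = {name: 0 for name in TREE}
    values[variable] = new_value
    # Flag the immediate ancestors of every occurrence of the changed variable.
    flags = {name: variable in (left, right)
             for name, (_, left, right, _) in TREE.items()}
    for name in sorted(TREE, key=level):      # level 1 first, root last
        if not flags[name]:
            continue
        flags[name] = False
        op, left, right, parent = TREE[name]
        old, values[name] = values[name], op(values[left], values[right])
        counts[name] += 1
        if values[name] != old and parent is not None:
            flags[parent] = True              # pass the change up one level
    return counts


values = initial_values(1, 2, 3)
counts = breadth_first_update(values, "X", 2)
print(counts)  # {'d1': 1, 'd2': 1, 'd3': 1, 'd4': 1, 'd5': 1, 'd6': 1}
```

For the same change of X from 1 to 2, each node is now evaluated once, compared with two evaluations of d3 and three of d6 under the depth-first rule.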

Other types of optimizations, such as recognition of common
subexpressions (e.g. d1 and d4 in the above example), could help to
further reduce the amount of computation involved.

It must be noted that our ability to determine the leaf
nodes of the evaluation tree which are affected by the change in
the value of a variable, X in the above example, depended on our
ability to statically locate all the occurrences of that variable
in the whole expression. When this is not possible or practical,
the above procedure can not be used. Examples of such a case
are function calls or coroutine jumps. In the case of function
calls, if the name of the function is statically fixed, e.g.
f(X), then one can conceivably locate, at compile-time, the text
of the function and see if it uses the variable whose value just
changed. If the function call is to a dynamically computed
address, this has to be done at run-time, introducing substantial
overhead.

Another  example  of  a  case  where  it  may  be  impossible  to 
identify  all  possible  occurrences  of  a  term  is  accesses  to  a 
dynamically  computed  address.  Any  term  involving  the  contents 
of  a  dynamically  computed  address  should  be  checked  after  a 
change  in  the  value  of  any  object  machine  variable  to  see  if 
the  dynamically  computed  address  is  equal  to  that  of  the  variable 
whose value just changed. Alternatively, instead of computing
the dynamic address with each change in the object machine state,
that address itself could be maintained by the continuous
evaluation techniques described above.
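The second alternative, in which the address is itself maintained continuously, might be sketched as follows for a term of the form "contents of the location whose address is held in a pointer cell". The `IndirectTerm` class and the dictionary model of memory are assumptions of the sketch.

```python
class IndirectTerm:
    """Tracks the term MEM[MEM[pointer]], whose address is computed dynamically."""

    def __init__(self, memory, pointer):
        self.memory = memory
        self.pointer = pointer
        self.address = memory[pointer]      # first level: the address itself
        self.value = memory[self.address]   # second level: contents of that address
        self.address_updates = 0

    def on_store(self, address, value):
        """Model a store into memory; return True if the value of the term changed."""
        self.memory[address] = value
        if address == self.pointer:         # the address expression changed
            self.address = value
            self.address_updates += 1
        if address in (self.pointer, self.address):
            old, self.value = self.value, self.memory[self.address]
            return self.value != old
        return False                        # the store cannot affect the term


memory = {0: 10, 10: 5, 11: 7, 12: 0}
term = IndirectTerm(memory, pointer=0)

changes = [
    term.on_store(12, 99),  # unrelated location
    term.on_store(10, 6),   # the location currently pointed to
    term.on_store(0, 11),   # the pointer itself: the address moves to 11
    term.on_store(10, 1),   # old target, no longer relevant
]
```

The address is recomputed only when the pointer cell is stored into, rather than after every change in the object machine state.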

This last example illustrates a cascaded two-level continuously-
evaluated expression and provides an example of hierarchical
systems of such expressions as envisioned by Fisher.




CHAPTER  7 


EXECUTION ANALYSIS FACILITIES FOR ALGOL-LIKE LANGUAGES


My  only  hands-on  experience  with  the  implementation  of  the 
presented  ideas  on  monitoring  and  modelling  has  been  with  programs 
written at the level of the dominant, contemporary central
processor instruction set. In this chapter, I would like to consider
the translation of these ideas to the class of languages which
has come to be called ALGOL-like languages, which are a subset
of "problem-oriented" or "procedural" languages.

It  must  be  emphasized  that  the  intent  of  this  chapter  is 
not  to  present  a  design  specification  for  execution  analysis 
facilities  for  any  specific  high-level  language,  but  to  explore 
the  basic  problem  areas  uniquely  associated  with  this  area.  Hence 
the  level  of  detail  will  be  much  less  than  that  of  Chapter  3  in 
which  the  design  of  a  particular  prototype  system  was  discussed; 
but  hopefully  enough  ground  will  be  covered  to  provide  a  starting 
base  for  the  researcher  or  designer  interested  in  this  area. 

7.1 The Added Complexity of High-level Languages

In  some  sense,  since  a  language  and  its  abstract  processor, 
which  may  be  called  its  "machine",  are  two  sides  of  the  same  coin, 
there should be no conceptual difficulty in translating the
techniques we have discussed in the preceding chapters for "machine
language" to any language for which a machine exists or can be
built.  The  only  difference  is  that  the  machines  for  ALGOL-like 
languages  are  much  more  complex  than  the  machines  considered  so 
far.  Hence,  some  things  which  were  very  easy  to  handle  before, 
now  become  difficult.  Let  us  consider  the  added  complexity  in 
three  parts: 

1-  Syntactic  complexity, 

2-  Semantic  complexity, 

3-  Language  implementation  complexity. 

I shall now proceed much as in Chapters 2 and 3 to discuss
these areas first in abstraction, then in reference to a particular
language.




7.1.1  On  Increased  Syntactic  Complexity 

The  increased  syntactic  complexity  arises  from  the  fact  that 
the  analysis  facility  has  to  be  able  to  understand  (parse)  each 
statement  in  the  source  program  at  run-time.  For  example,  one 
would like to be able to say, at run-time: "Trace on the TTY
the branch taken by every IF statement", or "type the contents
of X and Y[1,2] at entry and exit from any loop in the routine
R"  or  "for  every  'else  clause'  which  is  executed,  type  the  values 
of  all  the  variables  in  the  Boolean  expression  in  the  associated 
'if  clause'",  or  "type  the  values  of  the  operands  of  all  floating 
point  divide  operations,  except  in  routine  P". 

7.1.2 On Increased Semantic Complexity

An  example  of  added  semantic  complexity  is  dealing  with 
scope  rules  and  storage  allocation.  When  specifying  a  monitoring 
action  on  a  local  variable,  one  has  to  qualify  the  variable  name 
with  an  identification  of  the  block  in  which  it  is  declared  as 
a  local  variable  and  in  which  the  monitoring  action  is  to  be 
applicable.  Further,  if  the  block  is  executed  recursively,  then 
one  also  has  to  specify  which  "generations"  or  "incarnations" 
of  the  variable  one  is  referring  to.  Similarly,  references  to 
the actual parameters of a routine, "own" variables, values
returned by expressions etc. must be carefully qualified to ensure
reference  to  the  correct  data  element. 
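The kind of qualification described here can be illustrated with a small sketch in which a monitored variable is identified by block name, variable name and incarnation number. The `Machine` class and the instrumented `fact` routine are hypothetical and stand in for whatever the interpreter of the source language would provide.

```python
class Machine:
    """Tracks live incarnations of each block and logs watched assignments."""

    def __init__(self):
        self.depth = {}     # block name -> number of live incarnations
        self.watch = set()  # (block, variable, incarnation) triples to monitor
        self.log = []       # ((block, variable, incarnation), value) pairs

    def enter(self, block):
        self.depth[block] = self.depth.get(block, 0) + 1

    def leave(self, block):
        self.depth[block] -= 1

    def assign(self, block, name, value):
        key = (block, name, self.depth[block])
        if key in self.watch:
            self.log.append((key, value))


def fact(machine, n):
    """Recursive routine instrumented by hand for the purposes of the sketch."""
    machine.enter("fact")
    machine.assign("fact", "n", n)   # binding of the parameter in this incarnation
    result = 1 if n <= 1 else n * fact(machine, n - 1)
    machine.leave("fact")
    return result


machine = Machine()
machine.watch.add(("fact", "n", 2))  # only the second incarnation of n in fact
answer = fact(machine, 3)
```

Only the binding of n in the second incarnation of `fact` is logged, although the routine binds n three times during the call.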

Another  example  of  semantic  complexity  with  some  high-level 
languages  is  the  interpretation  of  data  types;  e.g.  checking 
the  data  types  of  the  actual  parameters  of  a  routine  inside  the 
routine. 

7.1.3  On  Complexity  due  to  Language  Implementation  Techniques 

Clearly,  an  important  question  which  comes  up  when  one  tries 
to  envision  how  such  an  analysis  facility  might  be  implemented 
is  whether  the  source  language  is  to  be  interpreted  or  compiled. 

A  form  of  compilation  called  "incremental  compilation",  in  which 
each  statement  is  compiled  as  independently  of  the  rest  of  the 
program as possible and control is returned to a run-time
monitor  after  each  statement,  is  a  convenient  compromise  which 
permits  us  to  reap  some  of  the  benefits  of  both  the  efficiency 
of  compiled  code  and  the  flexibility  of  interpretation. 

To be able to recognize at run-time the code corresponding
to the various parts of a source statement in a compiled program
(compiled by a non-optimizing compiler) requires some kind of
intermediate-level representation of the structure of the object
program. This representation may be a directed graph produced
at compilation time. Given such a representation of the source
program, the analysis facility can locate the relevant
parts of the object program and arrange for appropriate types
of trapping or supervisor call operations when those parts are
executed. In addition, the symbol table must be available at
run-time. This approach also requires that the structural
representation and the symbol table for each external (e.g. library)
routine that is used be available at run-time.

On the other hand, a "pure interpreter", which essentially
parses each statement every time it is executed, would not
need such an intermediate structural representation, since the
one-time effort to parse the program to insert the required
traps is negligible compared to the continuous parsing
effort of the interpretive execution. It will of course have
the symbol table available, which it needs itself during the
execution.

Between the extremes of pure compilation and pure interpretation
lies a spectrum of modes of execution involving varying
degrees of compiled and interpretive representation. In existing
languages, the extremes are represented, on the pure compilation
end, by [illegible], which requires no run-time packages (except I/O,
which is not a part of the language) and, on the pure interpretation
end, by APL, whose right-to-left scanning direction and
precedence rule come very close to making it a "post-fix operator"
language. In between lie languages (more accurately, their
current implementations) like PL/I, which requires an elaborate
run-time system even though it is a compiler language, and LISP,
which can intermix compiled and interpretive execution.

In this multi-dimensional space, for which I can conceive
of no significantly helpful metric for the purposes of this research,
I shall pick a much smaller subspace in the hope of discussing
its dimensions in a somewhat more systematic way. That is the
space of purely-interpreted languages. One reason for this choice
is that it is conceivable to use an interpreter for any language
for program development and testing purposes (or this could be
an incremental, debugging compiler). Secondly, this choice permits
us to avoid questions related to code generation and the
associated mappings between the source code and the object code.
Thus we can concentrate on the functional requirements (in the
spirit of Chapter 2) rather than implementation techniques. In
what follows, I shall apply the functional requirements outlined
in Chapter 2, and the concepts used in their realization in the
DAME system, to the specification of execution analysis facilities
for interpreter-based languages.




7.2  Execution  Analysis  Facilities  for  Interpreter-based 
Languages 

Let us recall the classes of required capabilities we
established in Chapter 2 for a general purpose execution analysis
facility:

1-  The information to which the analysis facility has access,

2-  The points in the execution cycle at which it can gain
control,

3-  Its instruction set,

4-  External appearance and miscellaneous useful features.

The  information  to  which  the  analysis  facility  should  have 
access,  can  be  considered  in  two  subclasses: 

(i)  Information  about  the  execution  history  of  the  particular 
program, 

(ii)  Information,  some  may  call  it  "intelligence",  about 
the  syntax  and  the  semantics  of  the  source  language,  as  discussed 
earlier  in  this  section. 


Subclass (i) is generically not very different from the
corresponding  requirement  for  low-level  machine  languages;  namely, 
the  information  needed  to  efficiently  reconstruct  any  past  machine 
state.  In  this  case,  of  course,  the  "machine"  is  that  defined 
by  the  source  language. 

In  order  to  give  more  concrete  content  to  what  is  meant  by 
a  "machine  state"  in  the  case  of  a  high-level  machine,  let  us 
divide,  as  we  did  before,  the  machine  state  into  two  parts:  the 
"state  of  the  memory",  i.e.  the  values  of  all  the  variables  defined 
so far, and the "current instruction". The question now becomes:
"What is an instruction in a high-level machine?". The question
arises  because  in  low-level  machine  languages,  an  instruction  was 
an  easily  identifiable  unit,  which  performed  a  very  small  number 
of,  sometimes  only  one,  indivisible  operations,  usually  involving 
up  to  three  or  four  operands,  including  side-effects.  Further, 
what  is  called  a  "machine  language  program"  consisted  of  a  sequence 
of  machine  instructions.  Hence  the  machine  instruction  turned 
out  to  be  a  convenient  unit  for  denoting  state  changes.  Clearly, 
what  is  usually  called  a  "statement"  in  a  high-level  language  is 
not  a  convenient  unit,  since  it  can  be  arbitrarily  long  and  complex. 
Thus,  it  is  reasonable  to  propose  the  execution  of  an  "operator" 
as the smallest unit in such a language. However, the "operators"
I have in mind here are a superset of the operators usually
defined in a syntactic description of the language. For example,
when in FORTRAN one writes


IF (X-Y) 10,20,30


the  selection  of  the  appropriate  case  as  a  result  of  evaluation 
of  (X-Y)  must  also  be  regarded  as  an  operator  as  well  as  the 
subtraction.  Similarly,  the  passing  of  parameters  in  a  subroutine 
call must also qualify as an operator. Thus, we are led to the
intuitive  concept  of  the  set  of  operators  in  a  given  source 
language  S  as  "the  set  of  denotations  for  the  largest  operations 
in  S  whose  effect  is  indivisible  with  respect  to  the  semantics 
of  S".  This  is  a  very  vague  and  informal  definition,  in  which 
I  rely  on  the  intuitive  meanings  of  all  the  terms  used.  The  phrase 
"indivisible  with  respect  to  the  semantics  of  S"  requires  some 
more  elaboration  however.  Loosely,  what  is  meant  by  this  is  that 
the  effect  on  the  machine  state  of  an  "operator"  can  not  be  broken 
down  into  smaller  units  such  that  other  operators  can  be  inserted 
between those units. These correspond, in a conventional machine,
to  those  periods  in  the  execution  cycle  in  which  the  processor 
is  uninterruptible.  They  also  represent,  from  a  user's  point  of 
view, the finest degree of detail which can appear in a user
request for information.
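To make the notion concrete, the FORTRAN arithmetic IF above can be modelled as two operator executions, a subtraction followed by a three-way selection, each of which is a point at which the analysis facility could be informed. The `arithmetic_if` function and the format of the trace entries are assumptions of this sketch.

```python
def arithmetic_if(x, y, labels, trace):
    """Model IF (X-Y) l1,l2,l3 as two operators, recording each in `trace`.

    The branch goes to l1 if X-Y is negative, l2 if it is zero and l3 if it
    is positive.
    """
    difference = x - y
    trace.append(("subtract", (x, y), difference))      # first operator

    if difference < 0:
        target = labels[0]
    elif difference == 0:
        target = labels[1]
    else:
        target = labels[2]
    trace.append(("select", difference, target))        # second operator
    return target


trace = []
target = arithmetic_if(3, 5, (10, 20, 30), trace)
print(target)  # 10
print(trace)   # [('subtract', (3, 5), -2), ('select', -2, 10)]
```

A user request such as "trace the branch taken by every IF statement" would then amount to watching the `select` entries only.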


7.3  A  Mini  Demonstration  Language 


For  the  sake  of  a  more  concrete  illustration  and  also  to 
face  some  of  the  problems  which  arise  in  the  application  of  these 
ideas,  I  shall  define  a  very  small,  hypothetical  member  of  the 
ALGOL  family  of  languages  and  use  it  as  a  vehicle  to  explore  these 
ideas  further.  I  have  decided  to  take  this  approach  rather  than 
pick  a  particular  implementation  of  an  existing  language  since 
the latter would very likely be a much larger task. In our
hypothetical language, we would like the following properties:


(i)  It  should  capture  the  essence  of  the  syntax  and  semantics 
of  ALGOL-like  languages,  i.e.  features  common  to  ALGOL  60,  PL/I, 
BLISS,  etc.  These  include  ALGOL-like  syntax,  block  structure, 
recursion. 


(ii)  Its  syntax  should  be  definable  by  a  small  grammar  (say, 
about  2  pages), 

(iii)  Its  semantics,  similarly,  should  be  easily  describable, 
even  if  only  informally. 


The mini-language to be used here was obtained by chopping
away a major part of BLISS/10, in fact by removing a great deal
of its unique and interesting parts (such as the uniform
interpretation of names as addresses, the contents operators, the
concept of "structures" etc.), retaining only a small portion
which looks sufficiently like ALGOL or PL/I etc. I shall refer
to this language as the Mini Demonstration Language (MDL). The
syntax of MDL is given in an appendix. For a description of
its semantics, I refer the reader to the BLISS/10 manual [WU 71].

7.3.1 Information Accessible by the MDL Analysis Facility

The  outline  given  below  follows  the  one  given  in  Section  6.2: 

(i)  The  external  state  of  the  MDL  machine,  i.e.  the 
contents  of  every  variable  and  address  directly  accessible  by 
the  MDL  program, 

(ii) Parts of the internal state of the MDL machine
containing the components of the current expression being evaluated,

(iii)  The  text  of  the  MDL  program, 

(iv)  Information  about  the  structure  defined  over  the  user 
program  for  purposes  of  analysis  (which  may  not  always  coincide 
with  the  syntactic  structure), 

(v)  Empirical  data  associated  with  each  component  of  the 
structure,  collected  during  execution, 

(vi)  Control  flow  history, 

(vii)  The  text  of  the  analysis  program, 

(viii)  The  list  of  analysis  actions  associated  with  each 
contact  point, 

(ix) Meta-variables holding intermediate results in analysis
computations.


Item (i), access to the MDL variables, does not require any
more elaboration. Item (ii) and an important extension are
discussed in detail in a succeeding subsection. Items (iii) and (vii),
namely access to the texts of MDL programs and analysis programs,
are discussed together in a succeeding section. Items (iv), (v),
(vi) and (viii) are discussed together in the next subsection.
Item (ix) is discussed in the section on data types for the analysis
language.




7.3.1.1 Representation and Accessing of MDL Execution History

Execution history information consists basically of control flow
history, data history and a mapping between these two histories.
It will be recalled that in the DAME system this took the form
of a node trace table containing node instance descriptors,
each of which contained pointers to the associated node object
and input/output sets, as well as certain other dynamic data.
Clearly, the key concepts which motivated this implementation
are those of nodes, node instances and input/output sets.

The concepts of nodes, node instances, input/output sets and an
execution history structured in the above manner are directly
transportable to MDL. The rules for defining nodes, however, are
not as simple as in the case of low-level machine language, where
the only requirement was that each node have a unique entry point
and a unique exit point. In MDL, since there is no direct analogue
of a "machine instruction", we must find a "unit of execution"
(UOE), which will in fact be the smallest syntactic construct
which can be designated as a node.

The expression-orientation of MDL suggests a natural candidate
for selection as a UOE: the simplest expressions in the language.
These are: a name or a decimal integer. However, to these we must
add an element which can act as a unit of computation, namely,
expressions involving a single "operator", which can be one of
the arithmetic or Boolean operators, relations, IF-THEN,
IF-THEN-ELSE, SELECT-OF-NSET, EXIT, RETURN, the indexing operator,
the routine call, WHILE-DO and INCR-FROM-TO-BY-DO. Clearly, since
only a single operator is to be involved, the two latter loop
operators can only appear in degenerate form; they either loop
zero times or an infinite number of times, or they compute a value
already given as an operand in the expression, e.g.
INCR I FROM J TO K DO Z or INCR I FROM J TO K DO J etc.
(These loop expressions, when used as UOE's, always return -1
according to the semantics of BLISS, since no EXIT can be
involved.)

Given the above basic definition for the UOE, the extension to
the general definition of a node is very natural: namely, any
expression sequence in the language having a unique entry point
and a unique exit point. This corresponds in the syntactic
specification of MDL to the non-terminal "expression sequence",
with the restriction that it not contain any escape-expressions
except possibly at the end of the node. This selection is in
harmony with the requirement that node definitions should be
compatible with the scopes of routines and blocks: it forces the
satisfaction of that requirement automatically.




In  adapting  the  concept  of  input/output  sets  to  higher-level 
languages,  there  are  certain  issues  with  respect  to  the  represen¬ 
tation  of  the  elements  of  input/output  sets  which  must  be  resolved. 
I  shall  only  sketch  some  solutions  to  these  issues  here  since 
they  do  not  seem  to  be  major  problems. 


One  issue  is  the  representation  of  local  variables  in  I/O 
sets.  Consider,  for  example,  the  following  MDL  code  and  the  defined 
nodes  N1  and  N2: 


    Routine R = begin
        local A, B;

        N1 {  A <- C[1] + 2;
              B <- D[A] + 5;

        N2 {  C[A] <- f(A,B);
    end;

The I/O sets of N1 would look like:

    I_N1 = [(C[1],V_1), (D[V_2],V_3)]

    O_N1 = [(R.A,V_2), (R.B,V_4)]

The I/O sets of N2 would look like:

    I_N2 = [(R.A,V_2), (R.B,V_4)]

    O_N2 = [(C[R.A],V_5)].


Here, we have denoted by V_i, i = 1,...,5, the value of C[1],
C[1]+2, D[C[1]+2] and D[C[1]+2]+5 in N1, and f(A,B) in N2,
respectively. We have also assumed that the function f does not
reference  any  non-local  variables.  To  denote  the  use  of  the 
local  variables  A  and  B,  we  have  used  the  qualified  form  R.A  and 
R.B  where  R  is  the  name  of  the  routine  in  which  they  are  declared. 
To  qualify  local  variables  which  are  declared  in  the  inner  blocks 
of  a  routine,  one  could  employ  a  block-numbering  scheme  similar 
to  the  one  used  by  some  Algol  compilers.  In  such  a  scheme,  a 
qualifying  index  is  added  for  each  static  level  and  blocks  in  the 
same  static  level  are  numbered  sequentially.  For  example,  in  the 
skeletal  code  : 




    Routine R = begin
        local A;
        ...
        begin
            local A, B;
            ...
        end;
        ...
        begin
            local A;
            ...
            begin
                local B;
                ...
            end;
        end;
    end;

the variables would be represented as: R.A, R.1.A, R.1.B,
R.2.A, R.2.1.B. However, as the number of levels increases,
this notation quickly becomes cumbersome. To overcome this, at
the expense of the loss of some information in the notation, we
can abbreviate by using only one index, which is the ordinal
number of the block-head where the variable is declared, without
any reference to static levels. Thus, in the above example, [...]
Note that while some information is lost, i.e. static level,
no ambiguity arises from the use of this representation.
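
The level-by-level qualification scheme described above can be illustrated with a short sketch. It is written in Python, which is not part of MDL or the analysis facility; the function name `qualify_locals` and the dictionary layout used for the block structure are assumptions made only for this illustration. Inner blocks on the same static level are numbered sequentially from 1 and one index is added per level, as in the text.

```python
def qualify_locals(routine, block):
    """Return qualified names for every local in a nested block structure.

    `block` is a dict with optional keys 'locals' (names declared in the
    block head) and 'blocks' (inner blocks, in textual order).
    """
    names = []

    def walk(blk, prefix):
        # Locals of this block carry the indices collected so far.
        for name in blk.get("locals", []):
            names.append(".".join(prefix + [name]))
        # Blocks on the same static level are numbered 1, 2, ...
        for i, inner in enumerate(blk.get("blocks", []), start=1):
            walk(inner, prefix + [str(i)])

    walk(block, [routine])
    return names


# The skeletal routine from the text: two inner blocks, the second of
# which contains one further block.
skeleton = {
    "locals": ["A"],
    "blocks": [
        {"locals": ["A", "B"]},
        {"locals": ["A"], "blocks": [{"locals": ["B"]}]},
    ],
}
qualified = qualify_locals("R", skeleton)
```

For the skeletal routine this yields the five names listed in the text, and every name is distinct even though A and B are each declared more than once.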

Similar, but less problematic issues arise with respect to
[...] variables. These can be resolved by use of a qualification
mechanism such as above. If recursive routine calls are permitted
in MDL, another qualifying mechanism must be introduced to
distinguish among recursive incarnations of the same routine and
their locals.




7.3.1.2 Access to the Internal State and Generic References
to Expression Sequences


An important feature that an analysis facility for a high-level
language must have is one which permits the user to say something
like:

    "If I ever do X, then do <action>"

where X is a partial syntactic specification of an expression
sequence. For example, X may be: '<name> <- <name>+1' which
would mean that <action> will be performed whenever the MDL machine
evaluates an expression whose syntactic form fits the given
specification. In such specifications, the permitted non-terminals
and their syntactic definitions must be those given in the syntactic
definition of the language, or they must be specified formally
somewhere (e.g. user manual) accessible by the user. The user
may also be given a device for defining new non-terminals to
abbreviate possibly long syntactic forms.


E.g.

    Define <sum-term> -> <name> <- <e>+<e>;

    Define <prod.term> -> <name> <- <e>*<e>;

    Define <sum-of-products> -> <name> <- <prod.term>+<prod.term>;

    Define <prod.-of-sums> -> <name> <- <sum-term>*<sum-term>;

    If I ever do <sum-of-products> or <prod.-of-sums> then do ...


Such a facility should also permit the use of terminal
symbols and reserved symbols with special meanings, e.g. to
indicate relations between values of non-terminals. For example:

    "If I ever do 'X <- <name>$1 + <name>$2*<name>$1'
     then do <action>"

would trigger <action> whenever a value computed by adding the
product of the values of two variables to the value of one of
them is assigned to X.


An extension to the facility for defining new non-terminals
leads us to the notion of "templates", which contain "holes" or
formal parameters. For example, one could write:

    "Define template T(X,Y,Z) = 'if X then Y else Z';

     If I ever do T('A>B', 'f(A)', 'f(B)') then do ...;

     If I ever do T('(C+5)<0', <e>, <escapeexpression>)
     then do ..."




These definitions would then cause the system to watch for
the expression 'if A>B then f(A) else f(B)' and for expressions
of the form 'if (C+5)<0 then <e> else <escapeexpression>', and
take the specified actions upon their occurrence.
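
How a specification such as 'X <- <name>$1 + <name>$2*<name>$1' might be tested against an expression can be sketched as follows. This is only an illustration in Python under simplifying assumptions: expressions are taken to be already split into flat token lists, the only non-terminal supported is <name>, and the function name `matches` is hypothetical. A real facility would match against the parse tree and the full MDL grammar.

```python
import re

# An identifier, and the <name> non-terminal with an optional $k tag.
NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*$")
NONTERM = re.compile(r"<name>(?:\$(\d+))?$")


def matches(pattern, tokens):
    """Return True if the token list fits the syntactic specification.

    Occurrences of <name> carrying the same $k tag must be bound to
    the same identifier; untagged <name> matches any identifier.
    """
    if len(pattern) != len(tokens):
        return False
    bound = {}
    for pat, tok in zip(pattern, tokens):
        m = NONTERM.match(pat)
        if m is None:
            # Terminal symbol: must appear literally.
            if pat != tok:
                return False
            continue
        if not NAME.match(tok):
            return False
        tag = m.group(1)
        if tag is not None and bound.setdefault(tag, tok) != tok:
            return False
    return True


pattern = ["X", "<-", "<name>$1", "+", "<name>$2", "*", "<name>$1"]
hit = matches(pattern, ["X", "<-", "A", "+", "B", "*", "A"])
miss = matches(pattern, ["X", "<-", "A", "+", "B", "*", "C"])
```

Here `hit` is true because both occurrences of $1 are bound to A, while `miss` is false because the last operand is a different variable.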

7.3.1.3 Access to MDL and MDLAF Texts

The  primary  reason  for  access  to  the  texts  of  MDL  and  MDLAF 
programs  is  the  desirability  of  on-line  editing  of  these  programs. 
The  user  should  not  have  to  terminate  the  analysis  session  to 
make  corrections  to  either  of  these  programs. 

A second reason is to be able to analyze and optimize, under
program control, the set of actions associated with a contact
point (e.g. to eliminate redundancy, to determine unintended
dependencies between actions).

A third reason is to facilitate the specification by the
user of expressions which are to be monitored.

Thus  there  seems  to  be  a  need  for  two  different  types  of 
editors;  one  is  the  more  conventional,  line  or  character-oriented 
editor  to  be  used  in  preparing  and  editing  of  MDL  and  MDLAF  texts; 
the  other  is  a  lexeme-oriented  editor  which  knows  the  syntax  of 
MDLAF  and  can  respond  to  requests  like: 

"If  there  are  any  assignment  operators  in  the  MDLAF  actions 
associated  with  fetches  from  X,  return  a  list  of  pointers  to 
those  actions,  else  return  a  null  list," 

or,

"Are  there  any  continuously-evaluated  MDLAF  expressions  involving 
Y?  If  so,  delete  them." 

It  is  clear  that  such  an  editor  will  have  to  know  the  syntax 
of  MDLAF  as  well  as  its  internal  representation  in  order  to  be 
able  to  find  the  desired  pointers,  delete  expressions  and  the  like. 




7.3.2 Contact Points and Hook Insertion in the MDL
Analysis Facility

Recalling  our  earlier  definition  of  contact  points  in  the 
context  of  low-level  machines,  as  "those  points  in  the  instruction 
cycle  at  which  the  analysis  facility  can  gain  control",  we  can 
translate  this  notion  to  the  domain  of  high-level  languages  such 
as  MDL  in  terms  of  the  unit  of  execution  which  we  have  selected, 
namely,  individual  operands  and  operators.  That  is  to  say,  the 
MDL  machine  will  check  for  any  required  monitoring  actions  after 
the  fetch  of  each  operand  of  an  operator,  just  prior  to  and  just 
after  the  application  of  the  operator,  and  just  before  the  storage 
of  the  result.  This  requires  that  we  specify  the  order  in  which 
the  checks  will  be  made  within  the  expression  involving  the  fetch 
of  several  operands.  It  seems  natural  that  this  order  should 
be the same as that which is specified in the language for
evaluation of the operands of expressions. BLISS/10, and MDL, give
"no guarantee regarding the order in which a simpleexpression is
evaluated other than that provided by precedence and nesting..."
(BLISS Reference Manual, Jan. 15, 1970, p. 2.2b). Hence, in the
expression

    B <- (C <- 3) + (A <- 5)

no guarantee is made about which of the two parenthesized
expressions is evaluated and checked for hooks first. However, it
is guaranteed that store-hooks for B will be checked after both
of the assignments to C and A. I shall denote store-hooks and
fetch-hooks for a location X by SHOOK(X) and FHOOK(X) respectively.

In  addition  to  the  store-hooks  associated  with  A,  B  and  C, 
general  hooks  associated  with  the  initiation  and  completion  of 
every  expression  evaluation,  which  I  shall  designate  by  IEXPHOOK 
and  CEXPHOOK  respectively,  and  hooks  for  the  initiation  and  comple¬ 
tion  of  each  specific  expression,  to  be  designated  by  ISEXPHOOK 
and  CSEXPHOOK  will  be  checked.  Thus,  the  sequence  of  actions  in 
the evaluation of the above expression will be as follows (Note:
action sequences separated by // should be assumed to be done in
random order):

    ((IEXPHOOK;
      ISEXPHOOK;
      SHOOK(C);
      C <- 3;
      CSEXPHOOK;
      CEXPHOOK); //
     (IEXPHOOK;
      ISEXPHOOK;
      SHOOK(A);
      A <- 5;
      CSEXPHOOK;
      CEXPHOOK));
    IEXPHOOK;
    ISEXPHOOK;
    d_1 <- C + A;
    CSEXPHOOK;
    CEXPHOOK;
    IEXPHOOK;
    ISEXPHOOK;
    SHOOK(B);
    B <- d_1;
    CSEXPHOOK;
    CEXPHOOK;
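
The ordering of hook checks listed above can be reproduced by a small evaluator. The following Python sketch is illustrative only: the tuple encoding of expressions, the function name `evaluate` and the naming of temporaries as d1, d2, ... are assumptions, and the two parenthesized sub-expressions are simply evaluated left to right, which is one of the orders the language permits.

```python
def evaluate(expr, env, log, temps):
    """Evaluate a tiny expression tree, logging each hook check in order.

    Returns (value, text), where text names the result for the log.
    """
    kind = expr[0]
    if kind == "const":
        return expr[1], str(expr[1])
    if kind == "assign":
        _, var, rhs = expr
        value, text = evaluate(rhs, env, log, temps)
        # The store-hook is checked just before the store takes place.
        log.extend(["IEXPHOOK", "ISEXPHOOK", f"SHOOK({var})"])
        env[var] = value
        log.append(f"{var} <- {text}")
        log.extend(["CSEXPHOOK", "CEXPHOOK"])
        return value, var
    if kind == "add":
        _, left, right = expr
        lval, ltext = evaluate(left, env, log, temps)
        rval, rtext = evaluate(right, env, log, temps)
        temps.append(lval + rval)
        name = f"d{len(temps)}"          # temporary holding the sum
        log.extend(["IEXPHOOK", "ISEXPHOOK",
                    f"{name} <- {ltext} + {rtext}",
                    "CSEXPHOOK", "CEXPHOOK"])
        return lval + rval, name
    raise ValueError(f"unknown expression kind: {kind}")


# B <- (C <- 3) + (A <- 5)
tree = ("assign", "B", ("add", ("assign", "C", ("const", 3)),
                               ("assign", "A", ("const", 5))))
env, log = {}, []
evaluate(tree, env, log, [])
```

After the call, `log` holds the same sequence as the listing (with the C assignment taken first), and the check of SHOOK(B) appears after both inner assignments, as the text guarantees.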


A general hook (i.e. IEXPHOOK, CEXPHOOK) can be as simple as a
[...], although perhaps not prohibitive. Therefore the [...]

7.3.3 An Outline of the MDL Analysis Facility Language (AFL)

AFL is an extension of MDL containing a new data type,
several new syntactic constructs and a set of built-in functions
and reserved words. In this subsection, these extensions to
MDL are outlined.


(i) NODE Declaration

Syntax: NODE <node-declaration list>;

<node-declaration list> -> <node-decl.> /
                           <node-declaration list>,<node-decl.>

<node-decl.> -> <node name> = <delimiter>

<delimiter> -> <routine name> / <label> /
               <routine name>,<block delimiter> /
               <label>,<block delimiter>

<block delimiter> -> <integer> /
                     <integer>.<block delimiter> /
                     <block delimiter>,<block delimiter> /
                     <block delimiter>:<block delimiter>

Examples: NODE N1=ROUTINE1;

          NODE N2=ROUTINE,5;

          NODE N3=LOOP,1.1.2.3:5;

          NODE N4=LOOP,1.2:3, N5=LOOP,1.4:7;

Effect: The indices which are not followed or preceded by
a ':' represent lexical levels in the code which is in the
scope of the <routine name> or <label>. If a pair <x>:<y> is
not present, the entire level is assumed, otherwise the <x>th
complete expression through the <y>th complete expression at the
level of the last index is defined as a node with name <node name>.
Nodes must be disjoint or properly nested. Also, they must
begin and end in the same level and block.
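
As an illustration of how a <delimiter> decomposes, the following Python sketch splits the simple forms used in the examples into the enclosing routine or label, the list of lexical level indices, and the optional <x>:<y> expression range. The function name `parse_delimiter` and the returned dictionary are assumptions made for this sketch, and the form <block delimiter>,<block delimiter> is deliberately not handled.

```python
def parse_delimiter(text):
    """Split e.g. 'LOOP,1.1.2.3:5' into unit, level indices and range."""
    unit, _, rest = text.replace(" ", "").partition(",")
    levels, span = [], None
    if rest:
        parts = rest.split(".")
        last = parts.pop()
        if ":" in last:
            # <x>:<y> selects complete expressions x through y at the
            # level named by the preceding indices.
            first, _, final = last.partition(":")
            span = (int(first), int(final))
        else:
            # No pair present: the last index is itself a level and
            # the entire level is assumed.
            parts.append(last)
        levels = [int(p) for p in parts]
    return {"unit": unit, "levels": levels, "span": span}


n1 = parse_delimiter("ROUTINE1")
n2 = parse_delimiter("ROUTINE,5")
n3 = parse_delimiter("LOOP,1.1.2.3:5")
n4 = parse_delimiter("LOOP,1.2:3")
```

Under this reading, N3 denotes the third through fifth complete expressions at lexical level 1.1.2 within the scope of LOOP.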

(ii)  Built-in  Functions  and  Reserved  Words 

Locating  Nodes  and  Node  Instances 

Find Node: $FN(<node-decl>)

If no node has been defined which satisfies <node-decl>,
returns 0, else returns the address of the first such node object.

Find Node Instance: $FNI(<node-inst. spec>[,<node-expr>])

<node-inst. spec> -> (<MDL expression>)

<node-expr> -> <node name>/<node obj. ptr>/
               $NODEOBJ(<node-inst expr>)/
               @<MDL expr>/$CURNODE

<node-inst expr> -> $CURINST/
                    $LASTINST(<node-expr>[,<node-inst expr>])/
                    $FIRSTINST(<node-expr>[,<node-inst expr>])/
                    $NEXTINST(<node-inst expr>[,<count>
                              [,<node-expr>]])/
                    $PRECINST(<node-inst expr>[,<count>
                              [,<node-expr>]])/
                    @<MDL expr>

Effect: Let the value of <node-inst. spec> be N. If
<node-expr> has been specified, then only the instances of that
node, otherwise all node instances are searched. If N>0, then
the direction of search is forward in time starting with the first
node instance; if N<=0, the direction of search is backward in
time starting with the last instance, with N=0 representing the
last instance. If the value of <node-expr> is zero, then zero
is returned; otherwise the value is taken as the address of a
node-object.
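
The search rule for $FNI can be made concrete with a sketch over a simplified node trace table. The Python below is an assumption-laden illustration: the trace is modelled as a list of (node name, instance number) pairs in execution order, the function name `fni` is hypothetical, and returning zero when the requested instance does not exist is assumed by analogy with the other functions in this subsection.

```python
def fni(trace, n, node=None):
    """Find a node instance in `trace`.

    n > 0  : the n-th instance counting forward from the first.
    n <= 0 : counting backward from the last, with n = 0 the last.
    If `node` is given, only instances of that node are searched.
    """
    candidates = [inst for inst in trace if node is None or inst[0] == node]
    if n > 0:
        index = n - 1
    else:
        index = len(candidates) - 1 + n
    if 0 <= index < len(candidates):
        return candidates[index]
    return 0


trace = [("N1", 1), ("N2", 2), ("N1", 3), ("N3", 4)]
first_any = fni(trace, 1)
last_any = fni(trace, 0)
last_n1 = fni(trace, 0, "N1")
prev_n1 = fni(trace, -1, "N1")
missing = fni(trace, 3, "N1")
```

With this model, `fni(trace, 0, "N1")` and `fni(trace, 1, "N1")` play the roles that the text assigns to $LASTINST and $FIRSTINST when their second argument is omitted.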

Node Object of: $NODEOBJ(<node-inst expr>)

If <node-inst expr> points to a node instance, then the
address of the node object associated with that instance is
returned, otherwise zero is returned.

Last Instance of: $LASTINST(<node-expr>[,<node-inst expr>])

If the second argument is omitted, it is equivalent to
$FNI(0,<node-expr>), otherwise a pointer to the last instance of
<node-expr> prior to <node-inst expr> is returned. If no such
instance can be found, zero is returned.

First Instance of: $FIRSTINST(<node-expr>[,<node-inst expr>])

If the second argument is omitted, it is equivalent to
$FNI(1,<node-expr>). Otherwise a pointer to the first instance
of <node-expr> after <node-inst expr> is returned. If no such
instance can be found, zero is returned.

Next Instance of: $NEXTINST(<node-inst expr>[,<count>
                            [,<node-expr>]])

If the second and third arguments are omitted, it is equivalent to
$FIRSTINST($NODEOBJ(<node-inst expr>), <node-inst expr>). If
only the third argument is omitted, it is equivalent to
$FNI(<count>+1, $NODEOBJ(<node-inst expr>)). Otherwise a pointer
to the nth instance of <node-expr> after <node-inst expr>, where
n=<count>, is returned. If no such instance can be found, 0 is
returned.

Preceding Instance of: $PRECINST(<node-inst expr>[,<count>
                                 [,<node-expr>]])

If the second and third arguments are omitted, it is equivalent to
$LASTINST($NODEOBJ(<node-inst expr>), <node-inst expr>). If
only the third argument is omitted, it is equivalent to
$FNI(<count>-1, $NODEOBJ(<node-inst expr>)). Otherwise a pointer
to the nth previous instance of <node-expr> relative to
<node-inst expr> is returned. If no such instance can be found,
0 is returned.


Current  Node  Instance:  $CURINST 

A  global  variable  which  always  points  to  the  node  instance 
which  was  entered  most  recently. 

Current Node Object: $CUROBJ

Equivalent to $NODEOBJ($CURINST).

Locating  Input/Output  Sets 

Input  Set  List  of  Node:  $ISL(<node-expr>) 

Returns a pointer to the Input Set List of node <node-expr> or,
in case of errors, zero.

Output Set List of Node: $OSL(<node-expr>)

Analogous to $ISL.

Input Set of Node Instance: $IS(<node-inst expr>)

Returns a pointer to the input set of <node-inst expr>, or, in
case of errors, zero.

Output Set of Node Instance: $OS(<node-inst expr>)

Analogous to $IS.

Accessing  Values  of  Addresses  in  I/O  Sets 

Value-part of I/O set element: $VAL(<id-expr>,<I/O set ptr>,
                                    <flag>)

<id-expr> -> <name>/<routine name>,<name>/<routine name>,
             <block id>,<name>

<block id> -> <pos. decimal>/<block id>,<pos. decimal>

Indirect Addressing Operator: @<name>

Returns a pointer to the object whose address is equal to
the value of <name>.

Get Attribute Value: $GATTR(<obj. expr>,<attr. name>,
                            <flag var>)

Looks for an attribute named <attr. name> in the object
<obj. expr>. If such an attribute is not found, <flag var> is
set to zero and zero is returned. Otherwise <flag var> is set
to 1 and the value of the attribute is returned.

Add Attribute: $ATTR(<obj. expr>,<attr. name>,<value>)

Change Attribute: $CATTR(<obj. expr>,<attr. name>,<value>)

Delete Attribute: $DATTR(<obj. expr>,<attr. name>)

These functions work in obvious ways. They return 1 if
successful, 0 otherwise.

In addition to these functions, a set of conventional list
processing functions such as create-list, include-in-list,
remove-object-from-list, head-of-list, tail-of-list,
cardinality-of-list etc. should be provided.

(iii)  Editing  MDL  and  AFL  Texts 

I  shall  comment  only  briefly  about  this  aspect  of  the  analysis 
facility.  The  need  for  two  different  types  of  editing  abilities 
has  been  noted  in  Chapter  6.  One  of  these  is  the  normal  set  of 
functions  provided  with  conventional  on-line  text  editors  for 
preparing  and  modifying  program  text.  The  other,  and  the  more 
interesting  one  for  our  purposes,  is  a  lexeme-oriented,  rather 
than character or line-oriented, editor which can work on
list-structure representations of MDL and AFL parse trees.
Considerable work has been done on such a syntax-driven editor by
L. Robinson and D. Parnas ([PP 73] and [Ro 73]). I feel I can do
nothing
better  than  cite  these  references  here. 

(iv)  Explicit  Hook  Insertion 

(iv  a)  Monitoring  of  Accesses  to  Variables 

To  monitor  the  accesses  to  a  variable  explicitly,  AFL 
contains  the  ON  FETCH,  ON  STORE  and  ON  USE  facilities,  whose 
syntax is:

<hook name>: ON <condition> <varlist> DO <expr>;

or,

<hook name>: ON <condition> DO <expr>;

<condition> -> FETCH/STORE/USE

<varlist> -> <MDL variable id>/
             <MDL array id>[<index expr>]/
             <varlist>,<varlist>


For example, "H: ON FETCH X DO expr" will cause the
evaluation of expr whenever the contents of X are fetched in
the evaluation of some MDL expression. If X is omitted, expr
will be evaluated with the fetch of every operand of every
expression. $OPDADDR and $OPDVAL will contain the address and
the value of the current operand.

ON STORE and ON STORE X work similarly, except that they
are checked prior to store operations.

ON USE and ON USE X cause checking upon both fetch and store
operations.

(iv  b)  Monitoring  of  Expressions 

Expressions  to  be  monitored  can  be  specified  in  one  of  two 
ways:  by  lexical  location  or  by  giving  the  syntax  of  the  expression. 

Further, the monitoring actions can be specified to be taken just
before the application of the "root" operator of the MDL expression
or just after it.

Specification by Lexical Location:

<hook name>: BEFORE <MDL location list> DO <expr>

<hook name>: AFTER <MDL location list> DO <expr>

<MDL location list> -> <MDL location>/
                       <MDL location list>,<MDL location>

<MDL location> -> <delimiter>

(See the syntax of the non-terminal <delimiter> in the
second paragraph of 7.3.3)

Examples:

L: BEFORE ROUTINE1, ROUTINE3 DO (X <- Y+1; TYPE(X));

L1: AFTER LABEL1.1.3 DO $DISAB(L);

L2: AFTER LABEL1.1.3:5 DO $TYPE(Z);

The first example will cause the parenthesized sequence of
expressions to be evaluated before every call on the routines
ROUTINE1 and ROUTINE3, after their parameters, if any, have been
evaluated.


The second example will disable the above action after the
third expression in the first block (or compound expression)
following LABEL1, in the same block (or compound expression) and
level as LABEL1.

The third example will cause the value of Z to be typed out
after the evaluation of each of the third, fourth and fifth
expressions in the block (or compound expression) mentioned above.

Specification by Syntax:

<hook name>: BEFORE EACH <syntax spec.> DO <expr>;

<hook name>: AFTER EACH <syntax spec.> DO <expr>;

<syntax spec.> takes the form of an expression in which
non-terminal symbols enclosed within <,> or the special symbol
$ (it means "anything") may appear.

Examples:

L: BEFORE EACH <loopexpression> DO <expr>;

L1: AFTER EACH $+A DO <expr>;

The first example would cause the evaluation of <expr>
before the evaluation of any WHILE and INCR expressions. The
second example would cause the evaluation of <expr> after each
addition operation involving A as the right-hand operand.

The syntactic specification facility could be extended by
additional features (involving "templates" and user-defined
non-terminals) discussed in subsection 7.3.1.2. I shall not
dwell on these extensions here.

(iv c) Monitoring of the Control Path

AFL also provides built-in facilities for monitoring the flow
of control. These are similar to the BEFORE and AFTER features
described earlier.

<hook name>: ALONG PATH <path descriptor> DO <expr>;

<hook name>: AFTER PATH <path descriptor> DO <expr>;

<path descriptor> -> <delimiter>/
                     <path descriptor>,<delimiter>

<delimiter> -> <unit>/<unit>(<count>)/<unit>[<path descriptor>]

<unit> -> <routine name>/<label>/<node name>

The first expression causes <path descriptor>, which contains,
say, n <delimiter>s, to be matched continuously against
the control path. If, for some k<=n, the first k elements of
<path descriptor> match the most recent k elements of the control
path (which are routine or node names or labels), then <expr>
is evaluated.

The second expression causes <expr> to be evaluated only
at the completion of the specified path.

In the specification of <delimiter>, the option <unit>(<count>)
means that <count> number of consecutive executions of the same
<unit>, without the intervention of any other units, is to be
watched for and treated as a single element in the path. The
option [<path descriptor>] provides for nesting of paths.

Examples:

L: ALONG PATH ROUT1[ROUT2, LABEL1, LOOP1], ROUT3 DO <expr>;

L: AFTER PATH R1(2), R2[R3(4), R4[R5, R6]](3) DO <expr>;

In the first example, <expr> will be evaluated after the
execution of each of ROUT2, LABEL1 and LOOP1 inside ROUT1, after
exit from such an execution of ROUT1 and after ROUT3, provided
they occur in that order with no intervening <unit>s.

In the second example, the interpretation is similar, except
that multiple consecutive executions of certain <unit>s are to
be considered as single elements.
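
The matching rule for path descriptors can be sketched in a few lines of Python. This is a simplified illustration rather than a description of an actual implementation: the descriptor is assumed to be a flat list of (unit, count) pairs, nested descriptors in square brackets are not handled, a run longer than <count> is not rejected, and the function name `matched_prefix` is hypothetical.

```python
def matched_prefix(descriptor, history):
    """Largest k such that the first k elements of `descriptor` match
    the most recent units of the control path `history`; 0 if none."""
    for k in range(len(descriptor), 0, -1):
        # <unit>(<count>) stands for `count` consecutive executions.
        expanded = [unit for unit, count in descriptor[:k]
                    for _ in range(count)]
        if len(expanded) <= len(history) and history[-len(expanded):] == expanded:
            return k
    return 0


# Flat descriptor corresponding to: R1(2), R2, R3
descriptor = [("R1", 2), ("R2", 1), ("R3", 1)]
control_path = ["R0", "R1", "R1", "R2", "R3"]

# Re-match after every new element of the control path.
progress = [matched_prefix(descriptor, control_path[:i + 1])
            for i in range(len(control_path))]

along_fires = [i for i, k in enumerate(progress) if k >= 1]
after_fires = [i for i, k in enumerate(progress) if k == len(descriptor)]
```

In this model an ALONG PATH hook would fire at every point where some prefix of the descriptor has just been matched, while an AFTER PATH hook would fire only where the whole descriptor has been matched.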

(v)  Continuously  Evaluated  Expressions 

CSELECT <elist> OF CSET <cexpressionset> TESC

<cexpressionset> -> / <ce> /
                    <cexpressionset>;<ce>

<ce> -> <MDL expression>:<AFL expression>

<elist> -> <AFL expression>/ <elist>,<AFL expression>

As will be obvious to those familiar with BLISS, this syntax
follows the syntax of the SELECT expression in BLISS, and hence
the expressions defined by it are called CSELECT (for "Continuous
Select") expressions. Its evaluation can be precisely described
by saying that it is equivalent to the evaluation of the AFL
expression "SELECT <elist> OF NSET <cexpressionset> TESN" after
each change in the value of any MDL variable in <elist> or in
the left-hand sides of <cexpressionset>.

Example:

    CSELECT (D+E) OF CSET
        A-B: f_1;
        A*B: f_2;
        C+D: f_3
    TESC;

This example will cause the monitoring of the values of
A, B, C, D and E and the continuous updating of the values of the
expressions D+E, A-B, A*B and C+D. When the value of D+E changes,
the value of the first left-hand expression, A-B, is compared
with the new value of D+E. If the two values are found equal,
then the expression f_1 is evaluated. Then, the next left-hand
expression, A*B, is compared with D+E, and if equal, f_2 is
evaluated. This process continues, until all left-hand expressions
have been tested.

Important note: if the value of a left-hand side is equal
to the value of the controlling expression (D+E in this example),
the right-hand side will be evaluated with each change in the
value of an MDL variable, until the values of the left-hand side
and the controlling expression become unequal.
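
The behaviour of a CSELECT expression, including the point made in the note above, can be imitated by re-running an ordinary select after every change to a monitored variable. The Python sketch below is only an illustration; the class name `CSelect`, the `store` helper and the use of closures for the MDL and AFL expressions are assumptions, and every variable in `env` is treated as monitored.

```python
class CSelect:
    """Re-evaluates a SELECT-like construct on every monitored change."""

    def __init__(self, control, arms):
        self.control = control   # function: env -> controlling value
        self.arms = arms         # list of (left-hand function, action)

    def on_change(self, env):
        selector = self.control(env)
        # Every left-hand side is tested, in order, as in SELECT.
        for lhs, action in self.arms:
            if lhs(env) == selector:
                action(env)


fired = []
cs = CSelect(
    lambda e: e["D"] + e["E"],
    [(lambda e: e["A"] - e["B"], lambda e: fired.append("f1")),
     (lambda e: e["A"] * e["B"], lambda e: fired.append("f2")),
     (lambda e: e["C"] + e["D"], lambda e: fired.append("f3"))],
)

env = {"A": 3, "B": 1, "C": 0, "D": 1, "E": 1}


def store(name, value):
    """Store into a monitored variable; re-evaluate only on a change."""
    if env[name] != value:
        env[name] = value
        cs.on_change(env)


store("E", 2)   # D+E becomes 3, which equals A*B
store("C", 2)   # A*B still equals D+E, and now C+D does as well
store("C", 2)   # no change in value, so nothing is re-evaluated
```

The second store shows the effect described in the note: f2 is evaluated again because A*B is still equal to D+E when another monitored variable changes.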




CHAPTER  8 


ARCHITECTURAL  FEATURES  FOR  EXECUTION  ANALYSIS 


As  has  been  previously  noted,  one  of  the  major  impediments 
to the wide use of the kinds of simulator-based techniques
described  so  far  is  the  slowness  of  simulation  at  the  memory 
cycle  level  and  the  information  loss  incurred  with  simulation 
at  instruction  level.  Further,  if,  unlike  in  the  DAME  system, 
the  object  machine  and  the  host  machine  are  the  same,  then  one 
would  like  to  be  able  to  execute  the  uninteresting  parts  of  the 
program,  i.e.  the  parts  we  do  not  wish  to  include  in  the  analysis, 
at  full  machine  speed  and  only  incur  overhead  over  the  monitored 
parts.  This  becomes  an  important  factor,  for  example,  in  the 
case  of  trying  to  isolate  a  bug  which  appears  only  after  a  con¬ 
siderable  amount  of  execution. 

To  deal  with  this  problem,  we  have  to  design  architectural 
features  to  be  implemented  in  hardware  or  microprogram  which 
would  significantly  reduce  the  amount  of  monitoring  done  by 
software.  Thus,  in  this  chapter,  I  shall  discuss: 

(i) Various techniques for the implementation of the hook
mechanism as a function of the relative word lengths of the host
machine (W_H) and the object machine (W_O),

(ii)  The  implementation  of  the  node  mechanism,  in  particu¬ 
lar  the  node  objects,  the  node  trace  table  and  the  input/output 
sets,  along  with  the  types  of  storage  technologies  appropriate 
for  these  data  structures, 

(iii)  The  interface  between  the  host  machine  and  the 
object  machine,  in  particular  the  data  paths  and  the  control 
paths  between  the  two. 

I  shall  then  conclude  the  chapter  with  an  outline  of  a  unified 
architecture  embodying  the  various  features  discussed,  assuming 
a  simple,  conventional  CPU  architecture  for  the  object  machine, 
and  a  review  of  several  reports  on  hardware  and  microprogrammed 
measuring  and  monitoring  facilities  by  other  workers. 




8.1 The Hook Mechanism


The  three  operations  which  lie  at  the  heart  of  any  monito¬ 
ring  scheme  are:  (i)  Given  a  particular  set  of  contact  points 
in  the  course  of  the  execution  of  the  object  machine,  the  deter¬ 
mination  of  whether  there  is  any  monitoring  action  to  be  taken 
at  the  current  contact  point,  (ii)  If  so,  locating  the  description 
of  the  action  to  be  taken,  (iii)  Taking  the  desired  action. 

Step  (i)  clearly  has  to  be  done  continuously,  i.e.  at  every 
occurrence  of  a  contact  point.  This  is  the  basic  price  paid 
for  running  on  a  monitored  machine.  Therefore,  it  is  desirable 
to  minimize  this  overhead.  Step  (ii)  is  normally  performed  much 
less  frequently  than  step  (i).  Thus,  in  programs  which  are  not 
heavily  monitored  this  step  will  not  normally  cause  excessive 
amounts  of  overhead.  In  heavily  monitored  programs  however,  this 
step can cause sufficient degradation of performance to prevent
widespread use of the monitoring facility. The amount of overhead
caused by step (iii), of course, is a direct function of the
particular actions to be taken and of their execution by the
analysis facility. In the rest of this chapter, I shall explore
several techniques for implementing these operations in conventional
single-instruction-stream/single-data-stream processors. For this
purpose, let us distinguish three cases:

(i) The host machine has a longer word-length than the
object machine (W_H > W_O),

(ii) The word lengths of the two machines are equal (W_H = W_O),

(iii) The host machine has a shorter word-length (W_H < W_O).

8.1.1 Monitoring with W_H Greater than W_O

As  already  discussed  in  Section  3.2,  the  availability  of 
extra  bits  in  the  host  machine  word  greatly  facilitates  the 
monitoring  operations  mentioned  above. 


Machine architectures with this feature are also known as
'Tagged Architectures'. Many applications of this architecture,
including some which are not discussed in this thesis, were discussed
by E. A. Feustel in his paper "On the Advantages of Tagged
Architecture" ([Fe 73]).




Depending on the number, n, of extra bits available, one
can use them as flags (say, with 1 <= n < 8) or as indices into a
table (8 <= n < A_H), or as an address in the address space of the
host machine (n >= A_H), where A_H is the width of a host machine
address. Let us, then, first consider the use of flags for this
purpose.

"Flag-bit" Implementation


One approach may be as follows: one designates a flag bit
for each kind of contact point applicable to addresses (as opposed
to general contact points). These contact points may be, for
example, designated as in the DAME system: namely, (i) after
every fetch, (ii) before every store, (iii) after every instruction
fetch, (iv) after every instruction completion. In this case,
one would need four bits. If fewer than four bits are available,
one can combine some of the flags and implement a flag in
the CPU indicating the type of operation currently being performed.
In that case, the monitoring logic would test the conjunction
of the flag bit in the current word being fetched or stored, and
the CPU operation flag. For example, with three bits instead
of four, one can combine the fetch and the instruction-fetch
flag bits. There would be a bit (let us call it the I-bit), indicating
whether the current fetch cycle is an instruction
fetch cycle or a data fetch cycle. This bit has to be accessible
by the monitor routines. Then, the user who wishes to detect
the accesses to a particular location as an instruction-fetch
would insert a hook to be activated upon every fetch from that
location, and within that hook, test the I-bit to determine if
the current access is an instruction fetch or not. Since, usually,
the same word is not accessed both as data and as instruction,
this technique would involve a conjunction and a comparison as
an overhead only in fetches from locations containing an instruction.
This does not seem to be an excessive price to pay. In
fact, if one is sure that the location being hooked is always
accessed properly, one can eliminate this test altogether. This
will probably be the most common case.
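
The three-bit variant just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bit layout, the names FETCH, STORE and COMPLETE, and the function hook_fires are hypothetical and do not come from DAME.

```python
# Hypothetical layout of the extra bits kept with each host machine word.
FETCH = 0b001     # hook on any fetch (data fetch and instruction fetch combined)
STORE = 0b010     # hook before every store
COMPLETE = 0b100  # hook after instruction completion


def hook_fires(word_flags, operation, i_bit, instruction_fetch_only=False):
    """Decide whether a monitor action is due at the current contact point.

    word_flags: flag bits stored with the word being accessed
    operation: CPU operation flag (FETCH, STORE or COMPLETE)
    i_bit: True when the current fetch cycle is an instruction fetch
    instruction_fetch_only: the hook is meant for instruction fetches only
    """
    # Conjunction of the word's flag bits and the CPU operation flag.
    if not (word_flags & operation):
        return False
    # Extra test of the I-bit, needed only for hooks on instruction fetches.
    if operation == FETCH and instruction_fetch_only:
        return i_bit
    return True
```

When the hooked location is known to be accessed only as an instruction, the I-bit test can be dropped by leaving instruction_fetch_only at its default.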


A  problem  arises  in  certain  computers  however,  if  one  wishes 
to  insert  a  hook  in  every  instruction  word  of  a  large  block  of 
consecutive  locations.  A  case  in  point  is  the  PDP-11,  which 
contains  in-line  data  interspersed  with  instructions  involving 
certain addressing modes. Since, in general, it is impossible
to tell statically if a particular word contains data or instruction,
the insertion of hooks only in locations containing instructions
cannot be mechanized, i.e. the user has to either hook each
instruction  word  individually,  or  hook  all  the  locations  in  a 
given block (using a mechanism similar to the DAME HOOK command
which accepts an address range as a parameter) and then go in
and  delete  the  hooks  for  individual  locations  containing  in-line 
data.  Either  way,  it  is  a  fairly  painful  process.  An  easier 
method  would  be  to  perform  a  test  in  the  monitor  routine  to  see 
if  the  current  cycle  is  an  instruction  fetch  cycle  or  not. 

Thus,  assuming  that  through  the  use  of  some  combination  of 
flag  bits,  the  presence  of  some  monitor  action  to  be  taken  at 
a  contact  point  can  be  determined,  let  us  now  consider  the  problem 
of  locating  the  description  of  the  monitor  action  to  be  taken. 
Ignoring  the  format  and  syntax  of  monitor  actions  themselves,  I 
shall  assume  that  a  single  pointer  is  sufficient  to  locate  the 
desired  action  description.  Hence,  what  is  needed  is  a  table 
look-up  procedure  with  two  inputs,  the  current  address  and  cycle 
(i.e.  instruction  or  data),  and  one  output,  a  pointer  to  the 
desired monitor action description. I shall not elaborate on
the implementation of this procedure: the two most obvious approaches
which come to mind are via an associative memory or via a
microprogrammed table-lookup mechanism.
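
As a rough software analogue of this look-up procedure, the following Python sketch uses a dictionary keyed by the pair (address, cycle) in place of an associative memory. The class and method names are hypothetical, and the action pointer is represented by an arbitrary value.

```python
INSTRUCTION_CYCLE = "instruction"
DATA_CYCLE = "data"


class ActionLookup:
    """Maps (address, cycle) to a pointer to a monitor action description."""

    def __init__(self):
        self._table = {}

    def install(self, address, cycle, action_pointer):
        # Register the action description for this address and cycle type.
        self._table[(address, cycle)] = action_pointer

    def locate(self, address, cycle):
        # Returns None when nothing is registered for this key.
        return self._table.get((address, cycle))
```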

"Table  Index"  Implementation 

Let us now consider the case where W_H is sufficiently larger
than W_O to permit the insertion of an index for a table, M, into
each host machine word representing an object machine word, in
addition  to,  or  instead  of,  the  flag  bits.  In  this  case,  each 
entry  in  table  M  would  contain  either  the  description  of  the 
action  itself,  or  a  pointer  to  it.  Thus,  we  would  not  need  an 
associative  memory  or  microprogrammed  look-up  procedure,  since 
the  table  index  would  be  built  into  each  host  machine  word. 

The limitation of this approach of course is that if there
are k bits available to be used as an index, one could have at most
a 2^k element direct-access table. Such a table could be extended
by chaining overflow areas to each entry etc. at the cost of some
more search.
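
The following Python sketch illustrates a 2^k element table with overflow chaining. It rests on assumptions that are not in the text: index 0 is reserved to mean "no hook", each entry is a chain of (address, action) pairs, and once all indices are in use a new hook shares a slot chosen from its address. The names are hypothetical.

```python
class IndexedActionTable:
    """Direct-access table M with 2**k entries, extended by overflow chains."""

    def __init__(self, k):
        self.size = 1 << k
        # Each slot holds a chain of (address, action) pairs.
        self.slots = [[] for _ in range(self.size)]
        self.next_free = 1  # index 0 is reserved here for "no hook"

    def install(self, address, action):
        """Return the index to be stored in the extra bits of the word."""
        if self.next_free < self.size:
            index = self.next_free
            self.next_free += 1
        else:
            # Table full: share a slot and lengthen its overflow chain.
            index = 1 + address % (self.size - 1)
        self.slots[index].append((address, action))
        return index

    def find(self, index, address):
        # The search along the chain is the extra cost mentioned above.
        for hooked_address, action in self.slots[index]:
            if hooked_address == address:
                return action
        return None
```

With k = 2 the first three hooks get private entries; the fourth lands in a shared slot and is found by a short search of that slot's chain.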

"Full  Pointer"  Implementation 

If  the  number  of  bits  available  is  greater  than  or  equal  to 
the  address  width  of  the  host  machine,  then  one  can  in  fact  store 
there  the  full  address  of  the  monitor  action  description.  This 
eliminates the need for a pre-allocated table to contain the
action descriptions or the pointers to them. It permits a
list-oriented structure to be created and maintained dynamically.

(As will be recalled from Chapter 3, DAME goes one step further
and creates a general list of "interesting objects" for each
location  requiring  one,  e.g.  such  locations  as  node  entry  points, 
or  addresses  whose  previous  values  are  being  collected.  Pointers 
to  monitor  actions,  i.e.  hook  objects,  are  simply  inserted  and 
deleted  as  elements  in  these  lists  as  required.) 

8.1.2 Monitoring with W_H Equal to W_O

This  includes  the  important  special  case  where  the  object 
machine  and  the  host  machine  are  the  same.  Hence,  it  will  be 
discussed  in  some  detail. 

Here we have, for each memory access, two pieces of information
with which to determine whether or not the address being
accessed is being monitored and, if so, to locate the monitor
action description: namely, the object machine address and its
contents. A technique of obtaining this information by using
only the address has already been discussed above. Another technique
which  uses  both  the  contents  of  the  accessed  address  and  the 
address  itself,  called  "Lambda  monitoring"  [LA  72],  was  described  in 
Section  3.7.1.  I  shall  summarize  this  technique  here  again.  The 
Lambda  monitoring  technique  relies  on  finding  a  bit  pattern, 

Lambda,  which  is  expected  to  be  used  very  rarely  by  any  object 
program  as  instruction,  address  or  data.  Lambda  can  be  determined 
by  the  user  at  load-time  (if  he  wishes  to  use  a  different  pattern 
than  the  default  one)  and  kept  by  the  system  in  a  Pattern  Register. 
Each  data  element  fetched  from  the  main  memory  or  a  register 
would  be  compared  with  Lambda  and  a  monitor  trap  would  be  caused 
whenever  an  object  machine  location  containing  that  pattern  is 
accessed.  Clearly,  this  operation  should  be  quite  transparent 
to  the  user  program  and  the  actual  contents  of  that  address  should 
be  made  available  to  the  user  program  by  the  control  logic  upon 
completion  of  the  monitor  action.  Once  a  monitor  trap  is  detected, 
one  then  has  to  locate  the  associated  monitor  action  description. 

For this purpose, again, the object machine address being accessed
could be used as an input into an associative memory or microprogrammed
look-up procedure to obtain a pointer to the monitor action
description. If the bit pattern Lambda is the actual contents
of the accessed address, then the table search mechanism would
return a "no-hit" code which would terminate the trap. One can
generalize this technique somewhat by defining several bit patterns,
to be kept in different pattern registers, indicating different
kinds of monitor traps, e.g. one for each hook type, provided
one can find several patterns which are likely to be used very
infrequently by user programs. This would enable one to search
a unique, and hence smaller, table for each such bit pattern.
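
A minimal software sketch of the fetch path under Lambda monitoring follows. It is an illustration only: the class name, the use of a dictionary for the look-up, and the choice to keep the displaced contents of a hooked location in that dictionary are assumptions of this sketch, not details given in the text. Stores to hooked locations are not modelled.

```python
class LambdaMonitoredMemory:
    """Object machine memory in which hooked words hold the pattern Lambda."""

    def __init__(self, size, pattern):
        self.cells = [0] * size
        self.pattern = pattern  # contents of the Pattern Register
        self.hooks = {}         # address -> (displaced contents, action)

    def hook(self, address, action):
        # Save the real contents and plant Lambda in their place.
        self.hooks[address] = (self.cells[address], action)
        self.cells[address] = self.pattern

    def fetch(self, address):
        value = self.cells[address]
        if value != self.pattern:
            return value  # common case: a single comparison
        entry = self.hooks.get(address)
        if entry is None:
            return value  # "no-hit": Lambda is the genuine contents
        displaced, action = entry
        action(address, displaced)  # perform the monitor action
        return displaced            # the user program sees the real contents
```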




The main potential difference in the performance of the
first technique, i.e. looking up every generated address in a
table, and the Lambda Monitoring technique depends on two criteria:
(i) the extent to which the look-up procedure can be overlapped
with the normal object machine operand fetch procedure, and
(ii) how often the bit pattern Lambda is used by the user program.
Note that the terms like "operand fetch" and
"operand store" are to be interpreted very liberally and that
they refer to any register and main memory location addressable
by the user program. In register-oriented machines (as most
current central processors are), most references are to registers.
Hence, the overlapping of the table look-up
procedure, mentioned in criterion (i) above, is with respect to
the shorter one of the CPU cycle and the main memory read time.
If there is not a substantial overlapped portion, then the first
technique is apt to be much slower than the Lambda Monitoring
scheme. On the other hand, if bit pattern Lambda is poorly
chosen and crops up often in the user program as data, address
or instruction, then the overhead associated with the latter
scheme can approach that of the former.

In addition to these two techniques, a third one can be
envisioned. This technique essentially is an architectural
feature which involves a modification of the physical addressing
structure of the object machine to make the host machine word
length W_H longer than the object machine word length W_O, in a
manner which is transparent to user programs, i.e. retaining the
logical [...]. The machine would have two modes:
(i) normal user production-run mode, (ii) analysis mode. In
the normal mode, the machine functions with no changes to the
addressing structure and instruction execution. In the analysis
mode however, every user-generated address is multiplied by two
and the contents of the double-word at that address is taken.
The lower half of the double-word represents the word which the
user is trying to access, and the upper half holds a pointer to
the monitor action description, much like in the "full pointer"
approach discussed in the preceding sub-section. These
halves can be retrieved either sequentially, using the same
memory port, or in parallel using a separate memory port for the
monitor word. In either case, the monitoring facility would pick up
the monitor word and perform the described action, if any. This
technique trades off half of the storage of the object machine
for the avoidance of a table-lookup procedure, by in effect using
the object machine (OM) address to locate the OM word
and the monitor action address simultaneously. Hence it may be
an attractive alternative in cases where a great majority of the
programs to be analyzed require less than half the available
storage for program code and data.


A refinement of this technique, requiring a little more
logic to be implemented in the hardware or microcode controlling
address generation inside the CPU, permits user control over,
and reductions in, the amount of extra storage
required. Let us suppose, for example, that the area of main
memory which the user is interested in monitoring lies between the
lower bound A and the upper bound B. Then, clearly, a scheme which
would multiply by two only those addresses between A and B and
still be transparent to the program being monitored would
[...]. Such a scheme can be implemented
in a straightforward manner. Let us define two registers A and
B in the machine, whose contents are to be specified by the user
during or before load-time. Then, each user address N generated
during an execution is tested by hardware to see if it
lies between A and B. If so, it is mapped into A+((N-A)*2)=2N-A.
If N is less than A, it is mapped into itself. Otherwise (it is
larger than B), it is mapped into C+N where C is a constant
computed when the registers A and B are loaded and is
equal to B-A. This transformation, denoted by G(A,B,X), is illustrated
in the following figure.


Figure: the address transformation G(A,B,X) over total storage
(highest addresses first).

   Unmonitored area above B, addressed in single words:
      G(A,B,X) = (B-A)+X              for X > B

   Monitored area between A and B, addressed in double words:
      G(A,B,X) = A+((X-A)*2) = 2X-A   for A <= X <= B

   Unmonitored area below A, addressed in single words:
      G(A,B,X) = X                    for X < A



This technique can be generalized to the case involving M
monitored areas, M>1. Such a generalization requires the comparison,
possibly in parallel, of X with the limit registers for
each of the monitored areas and the selection of a different
constant to be added to X or 2X for each portion of the memory.
Thus, if there are M such areas, with limits A_i, B_i, i=1,...,M,
and X is found to be in the Kth monitored area, then X is mapped
into

   A_K + sum_{j=1}^{K-1} (B_j - A_j) + 2(X - A_K)
      = (sum_{j=1}^{K-1} (B_j - A_j) - A_K) + 2X.

If X is smaller than A_1, it is unchanged. If it is in an unmonitored
area following the Kth monitored area, then it is mapped into

   sum_{j=1}^{K} (B_j - A_j) + X.
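
The generalized mapping can be sketched as a sequential scan of the limit pairs; a hardware version would presumably make the comparisons in parallel. As assumptions of this sketch, the areas are given in ascending order, do not overlap, and are treated as half-open ranges A_i <= X < B_i. The function name is hypothetical.

```python
def map_address(areas, x):
    """Map user address x given monitored areas [(A_1, B_1), ..., (A_M, B_M)].

    The areas must be sorted, non-overlapping and half-open (B_i exclusive).
    """
    offset = 0  # running sum of (B_j - A_j) over the areas already passed
    for a, b in areas:
        if x < a:
            break                      # unmonitored gap before this area
        if x < b:
            return offset - a + 2 * x  # inside this monitored area
        offset += b - a
    return offset + x                  # unmonitored: shift by the sum so far
```

With a single area this reduces to the transformation G(A,B,X) described earlier.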


8.1.3 Monitoring with W_H Less than W_O


This case is conceptually not very different from the preceding
two cases; however, its implementation will probably be
much more inefficient, especially if W_O is not some integral
multiple of W_H. I shall not say much on this case except to
point out that it can be made equivalent to either of the two
preceding  cases  by  an  address-transformation  mechanism  of  the 
type  described  earlier.  By  choosing  this  transformation  suitably, 
one  can  make  available  a  number  of  bits  in  the  representation  of 
each  word  of  the  object  machine,  so  that  these  bits  can  be  used 
as  described  under  the  "Flag  bits",  "Table  index"  or  "Full  pointer" 
approaches.  The  basic  inefficiency  lies  in  the  fact  that  one 
has  to  access  several  HM  words  to  simulate  access  to  each  OM  word. 

I conclude here the discussion of various techniques for
implementing the hook mechanism. The choice of the appropriate
technique for a particular processor will depend on the word
sizes of the two machines, and the types of memory and
microprogramming capability available.




8.2 Implementation of the Node Mechanism

One  of  the  major  components  of  the  Analysis  Facility  is 
the  Node  Mechanism  which  includes:  the  node  objects,  the  Node 
Trace  table,  the  input/output  sets,  and  the  creation,  maintenance 
and  searching  of  these  data  structures.  Hence,  in  this  section 
I  would  like  to  discuss  feasible  approaches  to  the  implementation 
of  this  mechanism.  I  shall  assume  that  the  Analysis  Facility 
has  available  for  its  use  a  certain  amount  of  main  memory,  a 
much  smaller  amount  of  high-speed  local  memory  and  some  associative 
memory.  Data  structures  will  be  assigned  to  the  type  of  memory 
most  appropriate  to  their  size,  frequency  and  method  of  access. 

I  shall  also  assume  the  existence  of  data  paths  between  the  local 
memory  and  the  main  memory,  and  between  the  associative  memory 
and  the  main  memory  as  well  as  microcode  instructions  to  make 
transfers  along  these  paths. 

The node objects are created when a node is defined. In the
course of the execution, they are accessed whenever the corresponding
node is entered or exited or when a monitor instruction
refers to them. They do not take up very much room, about 8-10
words per object. Except for the current node object, they are
normally not accessed very often. Hence, an appropriate storage
allocation for node objects would be to keep them in main memory,
except for the current node object which will be brought into
the local memory when the corresponding node is entered, maintained
in the local memory during the current node instance and put back
in its earlier position in main memory when the current instance
is exited.

The  Node  Trace  table  is  a  dynamically  growing  structure  whose 
size  is  a  direct  function  of  the  number  of  node  instances  executed. 
Here  too,  normally  only  the  table  entry  for  the  current  node 
instance  is  accessed  often.  Hence  this  latter  part  can  be  kept 
in  local  memory  in  the  same  manner  as  the  current  node  object  and 
the  rest  of  the  table  can  be  kept  in  main  memory. 

The same remarks also apply to the input/output sets with
one important qualification: the maintenance of the current
input/output sets will probably be best implemented through an
associative memory. This is due to the fact that one has to
search the current input or output set for every generated address
during the current node instance. If the address is not found,
it must be added to the current input or output set. If this is done
via a sequential search of these sets, the resulting overhead
is likely to be unacceptable. Thus, the current input/output
sets should be created and built up in associative memory and
transferred to main memory and linked to the I/O set list of the
associated node when the current node instance is exited.
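
The life cycle of the current input/output sets can be sketched in Python, with hash sets standing in for the associative memory. This is only an illustration: the class and method names are hypothetical, and the rule that fetched addresses go to the input set and stored addresses to the output set is a simplifying assumption of the sketch.

```python
class NodeIOSets:
    """Current I/O sets of a node plus the I/O set list of past instances."""

    def __init__(self):
        self.io_set_list = []  # stands in for the list kept in main memory
        self.inputs = set()    # current sets: stand in for associative memory
        self.outputs = set()

    def on_fetch(self, address):
        # Search-and-add in one step; no sequential scan of the set.
        self.inputs.add(address)

    def on_store(self, address):
        self.outputs.add(address)

    def exit_instance(self):
        # Transfer the current sets and link them to the node's I/O set list.
        self.io_set_list.append((frozenset(self.inputs), frozenset(self.outputs)))
        self.inputs.clear()
        self.outputs.clear()
```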




Another point worth mentioning with respect to the I/O
sets is the nesting of these sets if nested nodes are permitted.
Suppose  there  are  n  levels  of  nested  I/O  sets:  what  is  the 
best  way  to  maintain  them  -  to  maintain  all  of  them  with  each 
generated  address,  or  to  maintain  the  highest  level  set  only  and 
to  update  the  next  highest  level  (i.e.  its  parent)  only  when  the 
former  is  exited  by  adding  the  appropriate  entries  from  the 
highest  level  set  into  the  next  highest  one?  Both  approaches 
are feasible. The sophistication of the associative memory available
and the overhead of the two approaches will determine the
preferable  alternative  for  a  particular  implementation. 
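
The two alternatives for nested sets can be compared in a small Python sketch. The names are hypothetical, a single set per level is used for brevity, and the sketch assumes that every entry of an exited set is "appropriate" for its parent, which the text leaves open.

```python
class NestedIOSets:
    """Stack of nested sets maintained eagerly or lazily."""

    def __init__(self, eager):
        self.eager = eager
        self.stack = []  # innermost (highest level) set is last

    def enter(self):
        self.stack.append(set())

    def record(self, address):
        if self.eager:
            # Maintain all n levels with each generated address.
            for level in self.stack:
                level.add(address)
        else:
            # Maintain only the highest level set.
            self.stack[-1].add(address)

    def exit(self):
        finished = self.stack.pop()
        if not self.eager and self.stack:
            # Update the parent only when the nested instance is exited.
            self.stack[-1] |= finished
        return finished
```

Both settings yield the same sets at each exit; they differ only in when the work is done, per address in the eager case and per exit in the lazy case.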

8.3.  The  Interface  between  the  Analysis  Facility  and 
the  Central  Processor 

Since the Analysis Facility requires access to much information
inside the CPU and to the main memory, and since it needs the
ability to interrupt the CPU, it is worthwhile to consider the
interface between the Analysis Facility and a "conventional"
central processor. I shall not go into great detail in doing this,
however; hence I shall not refer to a specific processor, but
rather to one which is representative of contemporary architecture.

The  interface  between  the  analysis  facility  and  the  central 
processor  consists  of  data  and  control  paths  between  the  analysis 
facility  processor  and: 

1- Main memory address register (data path), MARP,

2- Main memory data register (data path), MDRP,

3- Main memory access control (control path), MACP,

4- General registers (data path), GRP,

5- Internal registers (data path), IRP,

6- CPU control logic (control path), CLP.

The  first  three  paths,  MARP,  MDRP  and  MACP,  permit  the 
analysis  facility  to  access  the  main  memory  address  and  data 
registers  as  well  as  main  memory  locations.  The  path  GRP  gives 
access  to  the  contents  of  the  general  registers. 

The  path  IRP  gives  access  to  internal  registers  which  contain 
information  about  the  opcode  and  operand  fields  obtained  by  the 
CPU  by  decoding  the  current  instruction  (e.g.  for  the  PDP-11, 
these  would  be  current  opcode,  source  and  destination  mode  and 
registers etc.). In the existing design of some processors, this
information may not be explicitly kept in this form during the
entire execution cycle. In such a case, either the processor
design may be modified to make this information available or
an instruction decoder may be built into the Analysis Facility
which can extract the required information.


The path, CLP, to the CPU control logic is a control path
which serves to synchronize the activities of the Analysis Facility
and the CPU. In particular, it will conduct signals from the
former to the latter to inhibit and enable instruction execution.


8.4 The Analysis Facility Processor (AFP)

So far, we have said nothing about how the Analysis Facility
will function, its instruction set and internal organization.
Although it is not desirable to go into much detail here, it is
probably worthwhile to outline the answers to these questions.

The question of how the Analysis Facility Processor (AFP)
will function, i.e. "will it have its own instruction execution
hardware or will it share that of the object machine?", and the
question of the instruction set of the AFP are interrelated.
Recalling the two subsets of the instruction set of DAME, namely
the "conventional" subset and the "monitoring and analysis"
subset, it is clear that if the AFP uses the same instruction
set as the object machine, then the monitoring and analysis instructions
must be compiled into the conventional subset, which can
then be directly executed by the AFP.


This approach has the advantage of not requiring a separate
instruction set processor for the AFP. However, it also requires
that the internal state of the object machine CPU be saved before
the AFP can do anything. Also, if the CPU is to be monitorable
while it is being used by the AFP, then, in fact, the internal
state of the CPU has to be saved in a stack, to permit an orderly
return from the various levels of monitoring and analysis activity.
Further, the object machine instruction set would have to be
extended to permit access to the internal registers, MARP, MDRP
etc., perhaps requiring new instructions. Finally, the instruction
set of the object machine will not necessarily be suitable to
perform the monitoring and analysis actions described in Chapters
2 and 3. In particular, if the list-structure orientation of the
DAME system is also adopted in the design of the AFP, one would
really prefer to have a machine instruction set suitable for list
processing. For these reasons, my preference would be a separate
instruction set processor for the AFP, both from the performance
point of view and the freedom it affords in defining new types
of operations. Hence, in the design for the AFP whose outline
is given in the following illustration, I assume a separate AFP,
[...] types of storage [...] and some associative memory as
discussed [...].

[...] to review several reported
implementations or designs with objectives roughly similar to,
but all less ambitious than, ours.

[...] involving instrumentation of programs for the XDS Sigma 7
[...] a description of a
"vernier-scale" technique for [...] instruction and event
times to a much higher resolution than the interval between the "ticks"
of the hardware clock, with a precision [...] only by the precision
with which the hardware clock [...]. This part of the
paper makes it "must" reading for those interested in program
instrumentation. Apart from this, the paper contains
a good discussion of the overhead caused by execution under simulation
("simulation artifact") and by the Execute instruction in
the Sigma 7 and in S360/75.

[...] on the SNUPER COMPUTER, a computer in instrumentation
automaton, [...] describe a design
(which, to the best of this author's knowledge, was never implemented)
for an automaton connected to the object computer by sensors,
which would select user-specified [...] in the course
of the execution and transform this data into a count kept in an
element associated with that event. The paper, while ahead of
its time (1967) in some of the ideas proposed in it, leaves
unspecified the crucial questions of (i) how the raw data coming
in from the object computer is converted into an index for the
bit table whose elements indicate the significance or insignificance
of the events bearing that index, (ii) what parts of the state
of the object computer (IBM 7094) are accessible by the SNUPER
COMPUTER, (iii) what kind of [...] the user
to specify the graphic display and the like.
Zelkowitz [...] describes briefly an associative
memory-based design for associating interrupt routines with fetch/
store addresses. His design includes a "condition code" field
in the associative table entry which [...] which of the "less
than", "equal to" and "greater than" relations must hold between
the address currently being accessed and the address stored in
the table entry in order for the transfer of control to a second
address specified in the table entry [...]. Apart from this feature,
this design is the basic straightforward approach to monitoring
addresses. However, in this author's [...] table [...]
envisioned by Zelkowitz's design. Also, this design does not
provide for monitoring accesses to the general registers.

Two  other  reports  on  hardware-monitor  based  approaches  should 
be  mentioned  here.  One  is  the  report  on  the  Neurotron  monitor 
by  R.  A.  Ashenbrenner  et  al  [ALN  71],  and  the  other  is  the  report 
on  a  hardware  monitor  for  a  multi-mini  processor  (C.mmp)  system 
by S. H. Fuller et al [FSW 73]. Both of these monitors appear
to  be  oriented  toward  data  selection  and  collection  and  not  the 
full  spectrum  of  general  purpose,  dynamic  analysis  activities 
envisioned  in  this  thesis. 

The  paper  which  comes  closest  in  spirit  and  approach  to  those 
described in this thesis is that of H. J. Saal and L. J. Shustek
[SS  72].  In  this  paper  entitled  "Microprogrammed  Implementation 
of  Computer  Measurement  Techniques",  the  authors  report  on  a 
project  in  which  a  Standard  Computer  Corporation  IC7000  computer, 
which  contains  a  writable  control  store,  was  microprogrammed  to 
collect (i) execution history data by recording all successful
branch instructions and relocation information, and (ii) distributions
of the usage of operation codes and consecutively executed
operation code pairs. What is of interest to us here is not
the actual data or the types of data collected, but rather the
insights they provide on the problems of inserting measurement
routines in emulators. This paper is also "must" reading for
those interested in building microprogrammed instrumentation
facilities. Since these problems are so relevant to this discussion,
I outline them here:

(i)  "...  since  microprogram  storage  is  an  extremely  scarce 
commodity,  it  was  prohibitively  expensive  to  insert  measurement 
routines  throughout  the  microprogram."  Thus,  in  the  Analysis 
Facility  Processor,  the  power  of  the  instruction  set  might  be 
limited  by  the  microprogram  storage  available. 

(ii)  "Since  our  microcomputers  possess  a  limited  subroutining 
facility  at  the  microprogram  level,  it  was  not  even  feasible  to 
include  a  subroutine  call  at  every  point  at  which  we  wished  to 
measure  the  performance  of  the  system."  This  is  an  example  of 

the  problems  caused  by  the  primitiveness  of  the  microprocessor 
instruction set. More on the same point: "A severe problem
found in the implementation of extensions via microprogramming,
generally not found in conventional software interpreters, arises
from the lack of many general facilities at the microprogram level."

(iii) "In addition, many instructions are executed directly
in hardware at instruction fetch time (most of the program transfer
instructions). Others share microcode but are semantically
distinguished by a large number of flip-flops (set by the hardwired
instruction fetch and decode) which perform extensive residual
control." Those flip-flops may well include data about addressing
modes, the success of a conditional branch, etc., and should be
accessible by the measurement routines. More on the problems
caused by hardware interpretation: "Microprogram machines are
generally not completely microprogrammed. Many aspects of
instruction decoding and operand fetching may be performed in a hardwired
scheduler in the interest of increased efficiency. This technique
conflicts with microprogram measurement. The hardwired decoding
scheme may automatically set a variety of residual control
registers and flip-flops to simplify the semantic emulation routines.
Current microprocessors have not been designed to allow these
registers to be explicitly read by an emulator and thus they
are not available to measurement routines. This lack of generality
imposes unnecessary complications to the microprogrammer, but
could be avoided in future microprocessor design."

(iv) "The Input/Output conflict between the microprogram
measuring routine and the system being measured was the single
most difficult problem in the implementation". The authors
recommend that the two systems use different channels for
input/output.

All  of  these  points  are  candid  examples  of  the  problems  which 
arise  in  the  design  and  implementation  of  microprogrammed  execution 
analysis facilities. They emphasize several points already made
in the discussion of the AFP above: namely, the need for a
powerful instruction set, access to object machine internal registers
and separation of the object machine hardware from that of the
AFP.




CONCLUSIONS 


The objectives of the research were (i) to explore the
possibility of designing facilities which are significantly
more helpful to the user in many types of execution analysis
than existing systems, (ii) to identify key problem areas,
(iii) to propose solutions to some of them, and (iv) to outline
directions of research for solving others.

The term "execution analysis" covers many important areas,
such as debugging, control flow, data flow, performance measurement
and storage reference pattern analysis. The main contribution
of the thesis is the development of a framework which
facilitates analysis tasks in all of these areas. A prototype of this
framework, called DAME (Dynamic Analysis and Modelling Environment),
has been implemented on the PDP-10 to study the behaviour of
PDP-11 programs. Its most novel aspect is that it permits the
user to define an abstract structure over his program at run-time
and perform his analysis in terms of the elements of that structure,
called "nodes". A node is a segment of code, not necessarily
contiguous in space, having a unique entry point and a unique
exit point. Every execution of a node is called an "instance"
of that node. During each node instance, DAME constructs a list,
called the "input-set", of all the inputs used, and upon exit,
a list, called the "output-set", of the changes to the system state
caused by the node instance. The input-set of a node instance
I is defined as the set of pairs <A,B> where A is an address
whose contents were read by I before being modified for the first
time by I, and B is the value read. Thus the input-set of I
represents all the outside information passed to I. The output-set
of I consists of pairs <C,D> where C is an address written into
by I and D the last value written. The significance of this
formulation of input/output sets is that it not only permits
backtracking to any arbitrary point in the execution history,
but also facilitates the determination of data flow between nodes.

This  formulation  is  also  very  helpful  in  narrowing  down  the  search  for 
an  elusive  bug  to  a  particular  node  instance  during  debugging. 
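
The input-set and output-set definitions above can be illustrated with
a minimal Python sketch. It is not DAME code: the access-record format
("R"/"W", address, value) and the names io_sets and data_flow are
assumptions made for illustration, and data_flow is a deliberate
simplification that ignores any node instances executed in between.

```python
from typing import Dict, Iterable, List, Tuple

# One memory access made during a node instance (hypothetical format):
# ("R", address, value_read) or ("W", address, value_written).
Access = Tuple[str, int, int]


def io_sets(accesses: Iterable[Access]) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Return (input_set, output_set) for one node instance.

    input_set maps each address read before the instance first wrote it
    to the first value read; output_set maps each written address to the
    last value written.
    """
    input_set: Dict[int, int] = {}
    output_set: Dict[int, int] = {}
    for kind, addr, value in accesses:
        if kind == "R":
            # Only the first read of a not-yet-written address is an input.
            if addr not in output_set and addr not in input_set:
                input_set[addr] = value
        elif kind == "W":
            output_set[addr] = value  # later writes overwrite earlier ones
        else:
            raise ValueError(f"unknown access kind {kind!r}")
    return input_set, output_set


def data_flow(producer_out: Dict[int, int], consumer_in: Dict[int, int]) -> List[int]:
    """Addresses written by one instance and used as input by a later one."""
    return sorted(set(producer_out) & set(consumer_in))


trace = [("R", 0o100, 5), ("W", 0o100, 6), ("R", 0o100, 6),
         ("R", 0o102, 1), ("W", 0o104, 9), ("W", 0o104, 10)]
ins, outs = io_sets(trace)
```

In this example the second read of address 100 is not an input, because
the instance had already written that address itself.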

Another  significant  advantage  of  this  approach  is  that  it  gives 
the  user  the  ability  to  control  the  amount  of  information  collected 
by  the  system  through  the  judicious  definition  of  nodes.  Other 
systems,  which  record  every  store  and  every  branch  operation, 
require  much  more  storage  to  represent  the  same  length  of  execution. 

In addition to the node mechanism, DAME offers a flexible
mechanism, called the "Hook Mechanism", which allows the user to
trigger monitoring and analysis actions at a wide variety of
points in the PDP-11 instruction cycle and at entry and exit from
nodes. By using the node and hook mechanisms and the comprehensive
instruction set of DAME, which includes general-purpose
computational instructions as well as instructions specifically
designed for monitoring, collecting and searching collected
data, the user can in most cases easily formulate DAME routines
to perform the analysis he is interested in. In Chapter 4,
five examples of the application of DAME to data flow analysis,
control flow analysis and instruction mix analysis are given.

The primary attribute sought in the design of DAME was
flexibility. This goal resulted in a list-oriented design;
each PDP-11 core location has a, possibly empty, list of
"interesting objects" associated with it, e.g. node descriptors, hook
descriptors, empirical data saved there by the analysis system,
a list of previous values. Each DAME object can have a secondary
attribute list which can contain system-defined or user-defined
attribute descriptors and arbitrary information associated with
the object. The DAME routines themselves and the DAME symbol
table are lists manipulable with the standard list functions.

The price for flexibility is usually loss of efficiency.
A particularly stiff price was paid for the flexibility afforded
by the design of the PDP-11 simulator at the memory cycle level.
The motivation for this choice was the prospect that DAME might
be used for analyses involving events at Unibus transfer level.
Also, it had been envisioned that simulators for several I/O
devices capable of generating NPR commands, which could interrupt
the CPU after a memory cycle within an instruction, could be
attached to the basic simulator. In hindsight, this decision
seems ill-advised; or perhaps, ironically, too inflexible. This
choice, coupled with the general-purpose scheduling mechanism
used for timing events, has caused DAME to spend two-thirds of
its time in the simulation scheduler, and there is no way to get
around this in the present design. A much better design would
have been to provide an option to the user as to whether to
simulate at memory cycle level or at instruction level or possibly
even at subroutine level. This would permit both detailed
memory-cycle-level studies over a short simulated time, and debugging,
flow analysis and performance measurement studies which require
an order of magnitude longer periods of simulated time. These
latter areas make up the bulk of the applications of DAME and
hence should have been given more emphasis.

The DAME language has proved unsatisfactory in some areas.
Its main disadvantage is the low-level instructions supplied for
conventional computing tasks (e.g. arithmetic). These are
equivalent to those of a 3-address hardware machine. However, this design
was in fact intended to provide a model for a possible hardware
implementation, and it was felt that a higher-level language could
subsequently be implemented to compile into the DAME language.
This task has not been done. Such a language would make DAME
easier to use.

The subset of DAME instructions dealing with monitoring,
data collection and retrieval has proved quite comprehensive
in its coverage and easy to use. While this subset could
certainly be enhanced by the implementation of higher-level
primitives discussed in Chapter 6, such as the FOREACH statement
in LEAP and continuously evaluated expressions, the provided
facilities have proved quite useful and also quite easy to
transport to a higher-level language. Their transportability to a
higher-level language, as demonstrated in Chapter 7, and the fact
that their design was based on the requirements set forth in
Chapter 2, indicate that the specifications in Chapter 2 are
indeed independent of the analysis language level.

In the final chapter, Chapter 8, we consider a class of
questions whose solution could have a significant impact on the
extent to which execution analysis facilities are used by
application and system programmers alike. These questions relate to
the hardware implementation of the primitives which are most
burdensome and cause most of the overhead in software. We did
not attempt to solve these problems; our intention was only to
pose the right questions and suggest approaches to their solution.
A real solution to these questions, due to the major design tasks
which still remain, would require a detailed, engineering level
design and possibly implementation, testing and trial use.

In  summary  then,  we  have  shown  that  execution  analysis 
facilities  significantly  more  powerful  and  widely  applicable  than 
the  existing  systems  for  individual  types  of  analyses,  such  as 
debugging  and  performance  measurement,  can  be  built  using  current 
technology.  While  the  prototype  implementation  appears  too 
expensive  for  wide  use,  a  more  cost-conscious  design  and  some 
assistance from hardware can bring the cost down substantially.

We  hope  that  the  ideas  demonstrated  in  this  thesis  will  shed 
some  light  on  the  problems  involved  and  point  the  way  to  some  of 
the  solutions. 




REFERENCES 

[ACM 73] Proceedings of "Workshop on Virtual Computer Systems",
ACM SIGARCH-SIGOPS, 1973.

[AS 71] Aschenbrenner, R. A., et al, "The Neurotron Monitor
System", Proc. FJCC 39 (1971).

[BA 67] Balzer, R. M., "EXDAMS - Extendable Debugging and
Monitoring System", Proc. FJCC, 1971.

[BE 66] Bernstein, A. J., "Analysis of Programs for Parallel
Processing", IEEE Transactions on Electronic Computers,
October, 1966.

[BO 68] Bernstein, W. A., Owens, J. J., "Debugging in a
Time-sharing Environment", Proc. FJCC, 1968.

[BK 70] Bussell, B., and Koster, R. A., "Instrumenting Computer
Systems and Their Programs", Proc. FJCC, 1970.

[CO 71] Cocke, J., "On Certain Graph-Theoretic Properties of
Programs", IBM Research Report RC 3391, 1971.

[DEC 71] "PDP11/20, 15, r20 Processor Handbook", Digital Equipment
Corp., Maynard, Mass., 1971.

[DEC 73] "BLISS-11 Programmer's Manual", Digital Equipment Corp.,
Maynard, Mass., 1973.

[ED 66] Evans, T. G., Darley, D. L., "On-line Debugging
Techniques: A Survey", Proc. FJCC, 1966.

[ED 65] Evans, T. G., and Darley, D. L., "DEBUG-An Extension
to Current Online Debugging Techniques", CACM Vol. 8,
No. 5.

[ES 67] Estrin, G., et al, "Snuper Computer-A Computer in
Instrumentation Automaton", Proc. SJCC, 1967.

[FE 73] Feustel, E. A., "On the Advantages of Tagged Architecture",
IEEE Transactions on Computers, Vol. C-22, No. 7, July
1973.

[FI 70] Fisher, E., "Control Structures for Programming Languages",
Ph.D. Thesis, Carnegie-Mellon University, 1970.

[FR 69] Feldman, J. A., and P. D. Rovner, "An ALGOL-Based
Associative Language", CACM Vol. 12, No. 8, August 1969.




[FSW 73] Fuller, S. H., R. J. Swan and W. A. Wulf, "The
Instrumentation of C.mmp, a Multi-(Mini) Processor",
Proc. of COMPCON 73, IEEE Computer Society.

[GA 69] Gaines, R. Stockton, "The Debugging of Computer Programs",
Institute for Defense Analyses, Working Paper No. 266,
August, 1969.

[KN 73] Knuth, Donald E., The Art of Computer Programming,
Vol. 3, Addison-Wesley, 1973.

[LA 65] Lampson, B. W., "Interactive Machine Language Programming",
Proc. FJCC 1965.

[LA 72] Lang, B., "A New Technique for Data Monitoring", ACM
SIGPLAN Notices, Vol. 7, No. 6, June 1972.

[LU 71] Lunde, A., "POOMAS-Poor Man's Simula", Unpublished
user manual for the POOMAS simulation package available
at CMU Computer Science Dept.

[ME 67] Martin, David F. and Estrin, Gerald, "Experiments on
Models of Computations and Systems", IEEE Transactions
on Electronic Computers, February, 1967.

[MCN 68] McNeley, J. L., "Compound Declarations", in Simulation
Programming Languages, ed. J. N. Buxton, North-Holland,
1968.

[MI 70] Mitchell, J. G., "The Design and Construction of Flexible
and Efficient Interactive Programming Systems", Ph.D.
Thesis, Carnegie-Mellon University, June 1970.

[RU 71] Rustin, R., "Debugging Techniques for Large Systems",
Courant Computer Science Symposium 1, Courant Institute
of Mathematical Sciences, New York University,
Prentice-Hall, 1971.

[SS 72] Saal, H. J. and L. J. Shustek, "Microprogrammed
Implementation of Computer Measurement Techniques", Proc. 5th
Annual ACM/SIGMICRO Workshop on Microprogramming,
University of Illinois, Urbana, Illinois.

[ST 65] Stockham, T. G., "Some Methods of Graphical Debugging",
Proc. IBM Scientific Computing Symposium on Man-Machine
Communication, May, 1965.

[WI 67] Wilde, D. U., "Program Analysis by Digital Computer",
Ph.D. Thesis, MIT, 1967.




[WU 71] Wulf, W. et al., "Bliss Reference Manual", CMU Computer
Science Department Research Report, January, 1971.

[WU 72] Wulf, W., "C.mmp: A Multi-Mini-Processor", Computer
Science Research Review 1971-72, Department of Computer
Science, Carnegie-Mellon University, 1972.

[ZE 71] Zelkowitz, M., "Interrupt Driven Programming", CACM,
June 1971, Vol. 14, No. 6.




APPENDIX  A 


CONTENTS 


Introduction to DAME
The Hook Mechanism
The Node Mechanism
Data Elements of DAME
Procedure for Getting Started with DAME
Instruction Format
Monitor Machine Instruction Set

    Commands for Creating Monitor Routines
        Load Monitor Routine (LMR)
        Define Monitor Routine (DMR)

    PDP-11 Flow Control Commands
        Run (RUN)
        Go (GO)
        Stop (STOP)
        Stop Conditional (STOPC)
        Node (NODE)
        Node Trace (NTR)
        Along (ALONG)
        Restore to Node Instance (REST)
        Replay Node Instance (RPLAY)

    Monitor Routine Flow Control Commands
        If (IF)
        While (WHL)
        Incr (INCR)
        Execute (EX)
        Push (PUSH)
        Pop (POP)
        Return (RET)

    Type-Out Commands
        Type Object (TOBJ)
        Type-Indirect Object (TIOBJ)
        Type -10 Symbol (TY10)
        Type Contents of -11 Addresses (T)
        Type Immediate (TI)
        Type Node Instances (TNI)
        Type Node Objects (TNO)

    Insert Commands
        Insert in -11 Address (I)
        Zero -11 Addresses (Z)
        Insert in Object (IOBJ)
        Insert Halfword (IHW)

    Commands to Create and Delete Objects
        Create Object (CR)
        Delete Object (DEL)

    Hook Manipulation Commands
        Hook (HOOK)
        Disable Hook (DISAB)
        Enable Hook (ENAB)

    Commands for Searching PDP-11 Execution History
        Find Input-Set (FISET)
        Find Output-Set (FOSET)
        Find Value (FVAL)
        Find Node Instance (FNI)
        Find Node Object (FNO)

    Value-trace Commands
        Initialize Value-trace (IVT)
        Value-trace Hook (VTH)

    Disk I/O Commands
        Write Disk (WDSK)
        Write-Indirect Disk (WIDSK)
        Read Disk (RDSK)

    Miscellaneous Commands
        Load PDP-11 Program (LOAD)
        Generalized Unary Operation with Assignment (UA)
        Generalized Binary Operation with Assignment (BA)
        Execute External (XX)
        Evaluate (EVAL)
        Time (TIME)
        Plot (PLOT)

A List of Useful Global PDP-10 Symbols and Their Contents
The Octal Value of OPN for Each Opcode


Introduction to DAME


DAME (Dynamic Analysis and Modelling Environment) is an
environment for running PDP-11/20 programs on the PDP-10 and
analyzing their execution. It contains a fairly rich instruction
set containing the facilities of a low-level programming language
and a set of facilities for controlling the execution on the
PDP-11 and the dynamic collection and searching of data. (We
shall refer to DAME instructions also as DAME commands.) Any
DAME command can be executed immediately or in a DAME routine.
A DAME routine can either be defined on-line by using the "Define
Monitor Routine" (DMR) command or it can be prepared ahead of
time in an SOS file with the extension .DAM and subsequently
loaded with the "Load Monitor Routine" (LMR) command. The latter
mode of operation is highly recommended since SOS has much better
editing facilities than DAME and one quickly gets tired of
entering the same commands repeatedly. LMR commands can be nested
in the sense that any executed routine can load and execute other
routines, achieving a hierarchical loading effect. This is a
very convenient mode of operation.

PDP-11 programs are loaded from a binary (.BIN) file using
the LOAD command. They are executed by using the RUN or GO
commands.


The  Hook  Mechanism 


The principal mechanism by which the user causes DAME to
take some action while his program is running is the Hook
Mechanism.

There are two classes of hooks: general hooks and addressed
hooks. Within each class there are several types. The type and
class of each hook is indicated by a mnemonic character constant
in the Hook command. General hooks are those in which a
user-specified monitor routine will be executed at:

1- Every fetch operation (hook-type 'GF) or,

2- Every store operation (type 'GS) or,

3- Every instruction fetch (type 'IF) or,

4- Every instruction completion (type 'IC) or,

5- Every node entry (type 'NE) or,

6- Every node exit (type 'NX).

(Nodes are explained later.)

Addressed hooks are those in which the user-specified monitor
routine will be executed only if the specified type of operation
is performed on an address in a given range. The types of
operations are:

7- Every fetch from an address range (type 'AF),

8- Every store into an address range (type 'AS),

9- Every instruction fetch from an address range (type 'AIF),

10- The completion of every instruction fetched from an
address range (type 'AIC).

The user determines what actions he would like taken at one
or more of the above points. He then prepares a monitor routine
(by a DMR command or by loading from a .DAM file with an LMR
command) and issues a Hook command giving as parameters: the
type of hook, the routine name, if an addressed hook then the
address range, and a name for the hook (consisting of a character
string up to 5 characters preceded by a single quote mark) with
which he can refer to the hook later on. He can place as many
of any type of hook as he wants. The routines which are thus
referenced in a hook specification must be defined prior to the
first activation of the hook. In practice, all monitor routines
are usually defined prior to the initiation of the execution of
the PDP-11 program.
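
As an illustration of how general and addressed hooks might be
dispatched, here is a small Python sketch. It is not the DAME
implementation: the HookTable class, the event names ("fetch", "store",
and so on) and the inclusive treatment of the address range are all
assumptions made for this example. Only the hook-type mnemonics and the
5-character limit on hook names come from the text above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hook-type mnemonic -> kind of simulator event that activates it.
# The event names on the right are hypothetical.
EVENT_OF = {
    "GF": "fetch", "GS": "store", "IF": "ifetch", "IC": "icomplete",
    "NE": "node_entry", "NX": "node_exit",
    "AF": "fetch", "AS": "store", "AIF": "ifetch", "AIC": "icomplete",
}
ADDRESSED = {"AF", "AS", "AIF", "AIC"}


@dataclass
class Hook:
    name: str
    hook_type: str
    routine: Callable[[int], None]
    addr_range: Optional[Tuple[int, int]] = None
    enabled: bool = True


class HookTable:
    def __init__(self) -> None:
        self.hooks: List[Hook] = []

    def hook(self, hook_type: str, routine: Callable[[int], None],
             name: str, addr_range: Optional[Tuple[int, int]] = None) -> None:
        """Place a hook; addressed types require an address range."""
        if hook_type not in EVENT_OF:
            raise ValueError(f"unknown hook type {hook_type!r}")
        if not 1 <= len(name) <= 5:
            raise ValueError("hook names are 1 to 5 characters")
        if (hook_type in ADDRESSED) != (addr_range is not None):
            raise ValueError("address range is required for addressed hooks only")
        self.hooks.append(Hook(name, hook_type, routine, addr_range))

    def fire(self, event: str, address: int) -> None:
        """Run every enabled hook that matches this event and address."""
        for h in self.hooks:
            if not h.enabled or EVENT_OF[h.hook_type] != event:
                continue
            if h.addr_range is not None:
                low, high = h.addr_range  # assumed inclusive
                if not low <= address <= high:
                    continue
            h.routine(address)


log = []
table = HookTable()
table.hook("GS", lambda a: log.append(("GS", a)), "ALLST")
table.hook("AF", lambda a: log.append(("AF", a)), "BUF", (0o1000, 0o1076))
table.fire("store", 0o500)
table.fire("fetch", 0o1000)
table.fire("fetch", 0o2000)   # outside the range: the addressed hook stays silent
```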


The  Node  Mechanism 


A second important mechanism by which the user collects
information about the behaviour of his program is the so-called
"Node Mechanism". The Node Mechanism reflects a certain view,
that held by DAME, of the notion of what "the execution of a
program" means. It contains facilities for extracting information
in compliance with that view, while the PDP-11 program runs.
The collected information makes it possible to reconstruct any
previous state of the -11, as well as to answer questions about
data flow and control flow history without restoring past states.

In DAME's view of the world, interesting parts of programs
are identified and divided into nodes by the user. (A default
mode is also provided. See NTR command.) Nodes can be as small
as a single instruction or as large as the entire program.




Nodes are defined thru the NODE command, by specifying their
entry and exit points. Nodes may be nested but no two nodes
may have the same entry or the same exit point; nor may nodes
overlap partially. Normally, the last instruction of a node
is a branch or subroutine call instruction and the first
instruction is the target of a branch instruction or a subroutine entry
point. Control need not physically stay within the starting and
ending addresses of a node; the entire path followed by the
program between the entry and the exit from the node will be
considered a part of that "node instance". A "node instance"
(NI) is a particular execution of a node. Associated with
the concept of an NI are the concepts of the NI's "input-set"
and the "output-set". The "input-set" consists of pairs (aj,bj)
where aj is an address from which the associated NI has fetched
something before writing into it for the first time, and bj is
the value fetched from aj for the first time during the NI.
Thus, the input-set represents all the "external information"
used by the NI. The output-set consists of all the addresses
written by a node-instance and the contents of those addresses
upon exit from the NI. Thus, it represents all the information
passed to the rest of the world by the NI.

For each node instance, the system creates a four-word
entry in a table, NODETRACE. The format of the entry is:

<node starting address>
<instruction count at entry>
<input-set ptr,,output-set ptr>
<no. of instructions in NI>
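
The third word of the entry holds two pointers in one word. The
following Python sketch assumes the usual PDP-10 convention of a
36-bit word split into two 18-bit halves (left half, then right half);
the NodeTraceEntry class and its field names are hypothetical and are
not taken from the DAME sources.

```python
from dataclasses import dataclass
from typing import List, Tuple

HALF_BITS = 18
HALF_MASK = (1 << HALF_BITS) - 1  # octal 777777


def pack_halves(left: int, right: int) -> int:
    """Pack two 18-bit values into one 36-bit word (left,,right)."""
    if not (0 <= left <= HALF_MASK and 0 <= right <= HALF_MASK):
        raise ValueError("each half must fit in 18 bits")
    return (left << HALF_BITS) | right


def unpack_halves(word: int) -> Tuple[int, int]:
    """Split a 36-bit word into its (left, right) 18-bit halves."""
    return (word >> HALF_BITS) & HALF_MASK, word & HALF_MASK


@dataclass
class NodeTraceEntry:
    start_address: int
    icount_at_entry: int
    input_set_ptr: int
    output_set_ptr: int
    n_instructions: int

    def to_words(self) -> List[int]:
        """Lay the entry out as the four words described above."""
        return [self.start_address,
                self.icount_at_entry,
                pack_halves(self.input_set_ptr, self.output_set_ptr),
                self.n_instructions]

    @classmethod
    def from_words(cls, words: List[int]) -> "NodeTraceEntry":
        left, right = unpack_halves(words[2])
        return cls(words[0], words[1], left, right, words[3])


entry = NodeTraceEntry(0o1000, 42, 0o123456, 0o654321, 7)
words = entry.to_words()
```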

In addition, associated with each node is a node-object,
which contains pointers to lists of pointers to the input- and
output-sets of every instance of that node. The I/O sets can
be displayed easily by the TOBJ(<obj. address>) command by
supplying the address of the desired I/O set list from the
node-object. These lists can also be manipulated in monitor routines.

Finally, all node-objects and input/output sets are
accessible, as most other objects in the system are, thru a
set of master list pointers, MNODESC, MINPUTSETSC and MOUTPUTSETSC.
These lists, called "subclass masters", contain a pointer to
every object of their respective subclasses.

A  set  of  commands  intended  to  facilitate  the  searching  of 
this  execution  history  information  is  provided  (See  "Commands 
for  Searching  Execution  History"). 




Data  Elements  of  DAME 

DAME  has  access  to  three  address  spaces,  each  of  which 
is  handled  in  a  similar,  but  not  identical,  manner.  These 
are: 

1- PDP-11 core, general and device registers,

2- Global PDP-10 symbols declared in the simulator and in
the rest of DAME code,

3- Monitor Machine objects (MMO) created by the user during
the session or pre-defined for the user by DAME during
initialization.

A  list  of  the  useful  elements  of  type  2  and  pre-defined 
objects  of  type  3  are  found  in  the  back  of  this  document.  Symbols 
of  type  1  are  identical  to  the  corresponding,  standard  PDP-11 
assembly  language  symbols  as  defined  in  [DEC  71]. 

Procedure  for  Getting  Started  with  DAME 

To  run  DAME,  enter  the  following  command  to  the  PDP-10 
Monitor : 

.RUN DAME C410BA07

It will respond with:

DAME11/10...

**

and unlock the keyboard. You are now in DAME command mode,
indicated by the double-asterisk prompt signal.

(Notation: A BNF-like notation is used to describe the
syntax of DAME instructions. "/" indicates disjunction, and
"<" and ">" delimit non-terminal symbols. Brackets "[" and "]"
delimit optional operands.)

Instruction  Format 

<MM instruction> -> <Type-1 instruction> / <Type-2 instruction>
<Type-1 instruction> -> <operator>(<operand list>)
<Type-2 instruction> -> <operator>(<operand list> <action>)
<operand list> -> <operand> / <operand list> <operand>
<operand> -> <octal integer> / @<octal integer> / <short char. string> /
             <global -10 symbol> / <MMO name>
<action> -> <MM routine name> / <compound instruction>
<short char. string> -> '<up to 5 characters>
<compound instruction> -> (<MM instruction list>)
<MM instruction list> -> <MM instruction> / <MM instruction list>
                         <MM instruction>

As can be seen, some monitor instructions take simple
operand lists while others (in particular, IF, INCR, WHL, HOOK
and ALONG instructions) can optionally take a compound-instruction
(the analogue of a compound statement or compound expression
in block-oriented languages) as the last operand. All operands
of an MM instruction must be defined prior to the execution of
that instruction. MMO's which are not pre-defined by the
system are defined by the CREATE instruction (except for monitor
routines, hooks and value-trace objects, as described later).
The form @<octal integer> refers to the contents of -11 core
location <octal integer> when the instruction is executed.
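
To make the instruction format concrete, the following Python sketch
parses text of this general shape into a nested tuple. It is only an
illustration under stated assumptions: operands are taken to be
separated by blanks, an empty operand list is accepted (as in STOP()),
and the function names tokenize and parse are hypothetical. It does not
check that operators or operands are defined.

```python
import re
from typing import List

# Parentheses, @<octal integer>, '<short string>, or a bare name/number.
_TOKEN = re.compile(r"\s*([()]|@[0-7]+|'[^\s()]+|[^\s()'@]+)")


def tokenize(text: str) -> List[str]:
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        token = match.group(1)
        if token.startswith("'") and len(token) > 6:
            raise SyntaxError("short character strings hold at most 5 characters")
        tokens.append(token)
        pos = match.end()
    return tokens


def _instruction(tokens, i):
    """Parse <operator>(<operands> [<compound instruction>]) at tokens[i]."""
    op = tokens[i]
    if op in ("(", ")") or tokens[i + 1] != "(":
        raise SyntaxError("expected <operator>(")
    i += 2
    operands, action = [], None
    while tokens[i] != ")":
        if tokens[i] == "(":
            action, i = _compound(tokens, i)
            if tokens[i] != ")":
                raise SyntaxError("a compound instruction must be the last operand")
        else:
            operands.append(tokens[i])
            i += 1
    return (op, operands, action), i + 1


def _compound(tokens, i):
    """Parse (<MM instruction list>) starting at the opening parenthesis."""
    body = []
    i += 1
    while tokens[i] != ")":
        inst, i = _instruction(tokens, i)
        body.append(inst)
    return body, i + 1


def parse(text: str):
    tokens = tokenize(text)
    try:
        inst, end = _instruction(tokens, 0)
    except IndexError:
        raise SyntaxError("unexpected end of input") from None
    if end != len(tokens):
        raise SyntaxError("trailing input after instruction")
    return inst


node = parse("NODE('LOOP 1000 1040)")
along = parse("ALONG(1000 2000 (STOP()))")
```

A monitor routine name used as an action is returned as an ordinary
operand here; telling it apart from other operands would need the
symbol table, which this sketch leaves out.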


MONITOR MACHINE INSTRUCTION SET


***************************************************************
Commands for Creating Monitor Routines
***************************************************************

"Load Monitor Routine" Command

Syntax: LMR(<filename> <routine spec.>)

<routine spec.> -> '<routine name> / '*

Effect: There must be an SOS file named <filename>.DAM, where
<filename> has at most five characters. The file must contain
Monitor routines in the following format:

<routine name> (<MM instruction> <MM instruction> ... )

<routine name> (<MM instruction> ... )

i.e. each routine must start on a new SOS line.
Standard SOS line numbering is assumed. If '* is specified, all
the routines in the file are loaded and defined as MMO's.
Otherwise, if the specified routine is found in the file, it is
loaded and defined, else an error message is typed.

***************************************************************

"Define Monitor Routine" Command

Syntax: DMR('<routine name>)

Effect: <routine name> must be at most 5 characters. The
command puts the user in the DAME edit mode, which is indicated
by the prompt characters for the first line of a routine
being defined. If the routine is to extend into more lines,
terminate each non-terminal line with an alt mode and
carriage-return; DAME will prompt with 1 for each non-terminal line
after the first line. Terminate the last line with only a
carriage-return.


***************************************************************
PDP-11 Flow Control Commands
***************************************************************

"Run" Command

Syntax: RUN([<starting address> [<halt count>]])

Effect: If <starting address> is specified, it is inserted in
the PDP-11 PC. If <halt count> is specified, it is inserted in
the global variable HALTCOUNT, the value of which is initialized
to -1 when the system is started up. The CPU is then given control,
starting with an instruction fetch from the current value of PC.
HALTCOUNT is decremented by 1 after the completion of every
instruction. When it reaches zero, execution is stopped and
command mode is entered.
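
The HALTCOUNT behaviour described above can be sketched in a few lines
of Python. This is an illustration only: the HaltCounter class and the
"available" parameter (a stand-in for how many instructions the program
would otherwise run) are assumptions, and word-length wraparound of the
counter is ignored, so a count that starts at -1 never reaches zero.

```python
class HaltCounter:
    """Toy model of instruction stepping controlled by HALTCOUNT."""

    def __init__(self) -> None:
        self.haltcount = -1   # value at system start-up
        self.icount = 0       # instructions completed so far

    def go(self, available: int, halt_count=None) -> int:
        """Run up to `available` instructions; return how many completed."""
        if halt_count is not None:
            self.haltcount = halt_count
        executed = 0
        while executed < available:
            executed += 1          # one instruction completes
            self.icount += 1
            self.haltcount -= 1    # decremented after every instruction
            if self.haltcount == 0:
                break              # execution stops, command mode is entered
        return executed


sim = HaltCounter()
free_run = sim.go(10)                 # default -1: no halt is triggered
limited = sim.go(10, halt_count=3)    # stops after exactly 3 instructions
```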


***************************************************************

"Go" Command

Syntax: GO([<halt count>])

Effect: If <halt count> is specified, it is inserted in
HALTCOUNT. Execution is resumed from its current state.

"Stop" Command

Syntax: STOP()

Effect: CPU is stopped and command mode is entered.

"Stop Conditional" Command

Syntax: STOPC(<id>)

Effect: If the value of <id> is odd, then same as STOP(),
otherwise no effect.


***************************************************************

"Node" Command

Syntax: NODE('<node name> <lower bound> <upper bound>)

Effect: Defines a node-object with name <node name> and whose
scope is <lower bound> to <upper bound>. See the format of
objects of node subclass and also NODETRACE table for the format
of node-instances (p. 41 and 49).

*************************************************************** 

"Node  Trace"  Command 

Syntax:  NTR() 

Effect:  This  command  causes  the  system  to  assume  the  default 

mode  for  node  definition.  The  first  executed  instruction  starts 
the  first  node  and  first  node  instance.  Thereafter,  every 
conditional branch and every deviation from sequential flow
causes the termination of the current node instance, and the
following instruction (i.e. the target of the transfer) constitutes
entry into a new node instance. The current node instance
is also terminated when a previously-established end of a node
instance is encountered even if control flow remains sequential.

*************************************************************** 

"Along"  Command 

Syntax: ALONG(N0 N1...Nk R)

Effect: Ni's must be the names or starting addresses of nodes
and R a compound-instruction or the name of a monitor routine.
Whenever the execution follows path N0,N1,...,Nk, R is executed
whenever this ALONG command is encountered. More precisely, let
L0,L1,...,Lt be the sequence (in reverse chronological order)
of nodes executed so far, with L0 = the current node. Then R will
be executed if and only if for some j, 0 ≤ j ≤ k, for all i = 0,...,j,
Ni=L(j-i); i.e. if some (j+1)-element initial segment N0,N1,...,Nj
in the specified path is identical to L(j),L(j-1),...,L0, the
last j+1 node instances executed.
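
The matching condition for ALONG can be sketched as a small predicate. This is an illustration under stated assumptions: the name `along_fires` is hypothetical, the history is passed as a list in reverse chronological order (element 0 is the current node), and j is allowed to range up to and including k.

```python
def along_fires(path, history):
    """Return True if some initial segment N0..Nj of `path` equals
    the last j+1 nodes executed, L(j), L(j-1), ..., L0.

    `history` is in reverse chronological order, so history[0] is
    the current node L0.
    """
    for j in range(len(path)):
        if j >= len(history):
            break                      # not enough history for this j
        if all(path[i] == history[j - i] for i in range(j + 1)):
            return True
    return False
```

For example, with the path A B C, a history whose two most recent nodes are A then B (current) satisfies the condition with j = 1.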

*************************************************************** 

"Restore  to  Node  Instance"  Command 

Syntax: REST(<index>)

<index> -> <octal integer> / <obj. name>

Effect: The PDP-11 environment which existed when the node
instance specified by <index> was entered is restored, including
the NODETRACE table and the instruction count ICOUNT.
However, simulation time is not restored.

"Replay  Node  Instance"  Command 


Syntax: RPLAY(['T] <starting-index> [<ending-index>])

Effect: The input-sets of the node instances back through the
instance of index <starting-index> are restored and a replay is made of
the node-instances specified by <starting-index> thru <ending-index>.
(A node instance has index i if it is the ith node
instance  entered  since  the  first  node  was  defined.  The  indices 
of  node  instances  can  be  determined  via  the  Find  Node  Instance 
(FNI)  command.)  At  the  end  of  the  replay,  the  PDP-11  state 
which  existed  when  the  RPLAY  command  was  issued  is  restored, 
including  the  NODETRACE  table,  instruction  count  and  simulation 
time.  If  T  is  specified,  the  instructions  are  traced  on  the 
TTY  as  they  are  executed. 




*************************************************************** 

*************************************************************** 

Monitor  Routine  Flow  Control  Commands 

*************************************************************** 

***************************************************************

"If" Command

Syntax: IF(<opd1> '<rel> <opd2> <then-action> [<else-action>])

<then-action> -> <action>
<else-action> -> <action>
<action> -> <routine name> / <compound-instruction>
<compound-instruction> -> (<MM instruction list>)
<MM instruction list> -> <MM instruction>
                         / <MM instruction list> <MM instruction>
<rel> -> EQ/NEQ/GE/GT/LE/LT

Effect: If the specified relation holds then the action
<then-action> is executed. Otherwise, if an <else-action>
has been specified, it is executed.

*************************************************************** 

"While"  Command 

Syntax: WHL(<opd> <action>)

Effect: The action <action> is executed while the value of
<opd> is odd. <action> is defined as above.

*************************************************************** 
"Incr"  Command 

Syntax: INCR(<var> <from-opd> <to-opd> <step-opd> <action>)

Effect: As the value of <var> is incremented from <from-opd>
to at most <to-opd> in steps of <step-opd>, <action> is executed
at each step. If <from-opd> is initially larger than <to-opd>,
<action> is not executed at all. <action> is defined as above.
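
The two conventions used by the control commands above, "odd means true" (STOPC, WHL) and the bounded stepping of INCR, can be sketched as follows. The function names are hypothetical, and the sketch assumes a positive step and that no iteration occurs when the starting value already exceeds the limit.

```python
def is_true(value):
    """A value counts as true when it is odd (low-order bit set)."""
    return value & 1 == 1


def incr_values(start, limit, step):
    """Values taken by <var> in an INCR loop: start, start+step, ...
    never exceeding `limit`. Assumes step > 0 (an assumption of this
    sketch, not a rule stated in the text)."""
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    values = []
    v = start
    while v <= limit:
        values.append(v)
        v += step
    return values
```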




"Execute"  Command 
Syntax: EX(<routine>)

Effect: The monitor routine <routine> is executed. This command,
together with the PUSH, POP and RET commands described below,
constitutes a subroutine facility with call-by-value parameters.

"Push"  Command 


Syntax: PUSH(<value>)

<value> -> <octal integer> / '<char. const. up to 5 chars.>
           / <obj. name>

Effect: The provided literal or the contents of word 0 of
<obj. name> are pushed on an (implied) stack from where they
can be retrieved by a POP command.

"Pop" Command
Syntax: POP(<obj. id>)

Effect: The last element pushed onto the stack is popped into
word 0 of <obj. id>.


***************************************************************


"Return"  Command 
Syntax: RET(<level count>)

Effect: Causes an exit from the last <level count> number of
monitor routines and compound-instruction levels; the level
count for current level being zero. (Note that, in fact, RET(0)
means that the MM instructions following the RET(0) in the same
level will never be executed. The effect
of that level would remain unchanged if the RET(0) and all the
following instructions in the same level were removed.)
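
The call-by-value behaviour of the EX/PUSH/POP facility can be illustrated with a short sketch. The class `MMObject` and the helper names are hypothetical stand-ins; only word 0 of an object takes part, as in the command descriptions.

```python
class MMObject:
    """Hypothetical stand-in for a monitor object with user words."""
    def __init__(self, name, word0=0):
        self.name = name
        self.words = [word0]


def push(stack, value):
    """PUSH: a literal is pushed as is; for an object, a copy of
    the contents of its word 0 is pushed."""
    if isinstance(value, MMObject):
        value = value.words[0]
    stack.append(value)


def pop(stack, obj):
    """POP: the last element pushed is popped into word 0 of obj."""
    obj.words[0] = stack.pop()


def double_routine(stack):
    """A hypothetical monitor routine: takes one parameter from the
    stack and leaves twice its value on the stack."""
    p = MMObject("P")
    pop(stack, p)
    p.words[0] *= 2          # changes only the local copy
    push(stack, p)
```

Because only values travel over the stack, the caller's object is untouched by whatever the routine does to its own copy.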




***************************************************************
"Type-Out" Commands


"Type Object" Command
Syntax: TOBJ(<obj. name or address>)

Effect: Types the contents of the object whose name or -10
address is given, at the terminal in a format appropriate to
the class of the object.

List-objects are typed between a pair of brackets.
Each element of the list is also typed according to these same
rules, recursively.

Representative-objects are indicated by a * followed by a
recursive type-out of the object they represent.

Numeric-variable objects are typed, for an object named
ABC, as 'ABC: ' followed by the contents of ABC where each word
is typed in PDP-10 numeric half-word format and words are separated
by slashes. The last word is followed by two spaces.

Character-variables are typed in the same format as
numeric-variable objects, except that each user word is interpreted as
a left-justified character string and typed out as such.

Numeric-constant and character-constant objects are typed
in a format similar to those of the corresponding variables
except that no name is typed. Long-character-constant objects
are typed without the slashes between user words.

(There are two classes of objects which are not normally
used by the user. These are included here only for completeness.
Id-objects, which represent names in a monitor instruction, are
typed between <...>. Non-homogeneous objects are typed, for an
n-word object, by interpreting user word i as an object class
and typing out user word i+1 according to that class followed by
a colon, i = 0,2,...,n-2. These are used in the Symbol Table to
represent entries.)

For an object whose class is something other than one of
the above, an error message is typed indicating the class of the
object. (For a list of object classes, see Create Object Command.)

Note that TOBJ command must be used to type only MM objects.

Every completed type-out is followed by a carriage-return,
line-feed.
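
The layout of a numeric-variable type-out can be sketched as below. The text does not spell out the "numeric half-word format"; the sketch assumes the customary PDP-10 notation of two octal 18-bit halves written as left,,right. Both function names are hypothetical.

```python
def halfword(word):
    """Format a 36-bit word as octal halves 'left,,right'
    (assumed notation, see the lead-in)."""
    left = (word >> 18) & 0o777777
    right = word & 0o777777
    return f"{left:o},,{right:o}"


def type_numeric_variable(name, words):
    """Name and colon, words separated by slashes, and two spaces
    after the last word, as described for numeric-variable objects."""
    return f"{name}: " + "/".join(halfword(w) for w in words) + "  "
```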

***************************************************************

"Type-Indirect Object" Command
Syntax: TIOBJ(<pointer>)

Effect: Performs "Type Object" Command on the object pointed to
by <pointer>. This command is especially useful for typing out
objects pointed to by global PDP-10 symbols, by giving the -10
symbol as the <pointer>. See the list of global variables at
the end of this appendix.

***************************************************************
"Type -10 Symbol" Command
Syntax: TY10(<global var. name or address>)

Effect: The contents of the specified global variable or the
-10 address is typed out in octal half-word format, followed by
two spaces.

*************************************************************** 

"Type  Contents  of  -11  Addresses"  Command 

Syntax: T(<starting address> [<ending address>])

Effect: Types out the contents of -11 core from <starting address>
to <ending address>. Either term may be a constant or an object
whose word 0 contains the address. If the latter is omitted,
it is taken to be equal to the former. For each core word, the
type-out has the form:

<MMO list ptr>,<I/M bits>,<-11 word>.

The first field is the 18-bit -10 address of the list of MMO's
associated with that -11 location, e.g. hooks, value-traces,
node-objects etc. These may be examined by entering TOBJ(<MMO
list ptr>). See "Object Subclasses" for the format of each
such object.

<I/M bits> are used in the determination of Input-Output sets,
and are not of direct interest to the user.


Each word is followed by two spaces. Words are written eight
to a line.
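
A sketch of how one simulated core word might be split into the three typed fields follows. The bit layout is an assumption inferred from the field widths mentioned in this appendix (a 16-bit -11 word, control bits above it, an 18-bit list pointer): the low 16 bits hold the -11 word, the next 2 bits the I/M bits, and the upper 18 bits the MMO list pointer. The function names are hypothetical.

```python
def unpack_core_word(word36):
    """Split a 36-bit word into (mmo_list_ptr, im_bits, pdp11_word)
    under the layout assumed in the lead-in."""
    pdp11_word = word36 & 0o177777
    im_bits = (word36 >> 16) & 0o3
    mmo_list_ptr = (word36 >> 18) & 0o777777
    return mmo_list_ptr, im_bits, pdp11_word


def format_dump(words):
    """Type each word as ptr,bits,word followed by two spaces,
    eight words to a line."""
    lines = []
    for start in range(0, len(words), 8):
        fields = []
        for w in words[start:start + 8]:
            ptr, im, val = unpack_core_word(w)
            fields.append(f"{ptr:o},{im:o},{val:o}  ")
        lines.append("".join(fields))
    return "\n".join(lines)
```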


"Type  Immediate"  Command 


Syntax: TI(<literal>)

<literal> -> <non-neg. octal integer> /
             '<char. string up to 4 chars.>

Effect: Types out the supplied literal.

***************************************************************

"Type  Node  Instances"  Command 


Syntax: TNI([<starting index>] <count>)

Effect: <count> number of node instances starting with
<starting index> are typed on the TTY (moving forward in time
if <count> is positive, otherwise moving backward in time).
If <starting index> is omitted, it is taken to be the setting
of the node instance pointer NIP. The format of each typed
instance is (typed on one line):

<index> <node address> <flags>
<input-set address> <output-set address>
<no. of instructions in the node instance>

"Type  Node  Objects"  Command 
Syntax: TNO(<a1> <a2>... <an>)

Effect: Types the node objects associated with PDP-11 addresses
a1,...,an.


*************************************************************** 


"Insert"  Commands 


***************************************************************


"Insert in -11 Address" Command
Syntax: I(<address> <value>)

Effect:  Each  operand  may  be  a  constant  or  an  object  name.  In 

the  latter  case,  the  contents  of  word  0  of  the  object  is  used 
as  the  -11  address  or  the  value.  If  the  value  is  less  than 
177777,  the  control  bits  (bits  16-35)  of  the  core  are  unaffected 
and  the  value  is  placed  in  the  -11  word.  Otherwise  the  full 
-10  word  is  replaced  by  the  value. 


"Zero  -11  Addresses"  Command 


Syntax: Z(<starting address> <ending address>)

Effect: Either operand may be a constant or an object name.
The -11 words between the specified objects are set to zero.


"Insert in Object" Command
Syntax: IOBJ(<obj. name> <N> <value>)

Effect: The <value> is inserted in word <N> of the object
<obj. name>. Either <N> or <value> may be a constant or an
object name. <value> may be an octal constant or a character
constant of at most 5 characters preceded by the single quote.
If <value> is the name of an object whose subclass is 13
(ADDR11SUBCLASS), its contents are taken to be an -11 core address,
and the contents of that address are used as the value.

*************************************************************** 

"Insert  Halfword"  Command 

Syntax: IHW(<obj. id> <start-address> [<n>])

Effect: The nth halfword in the -10, counting left to right,
starting with the left-halfword of <start-address>, is inserted
in right half of <obj. id>.


***************************************************************

***************************************************************


Commands  to  Create  and  Delete  Objects 

*************************************************************** 


"Create  Object"  Command 

Syntax: CR('<obj. name> [<class> <subclass> <size>])

Effect: Creates an object according to these specifications.
If only the first operand is specified, the default values for
the other 3 operands are used. These are #100 (numeric constant
class), 0 (free subclass) and 1 (1 user word). All specified
operands must be constants. <obj. name> must have at most 5
characters.

The object classes which the user may use are:

100: numeric class

300: character class (up to 5 characters)

700: long-character-string class

The  object  subclasses  which  the  user  may  use  are: 

0  :  free  subclass  (i.e.  uninterpreted) 

13 : PDP-11 address subclass (whenever the object is encountered,
     the contents of the PDP-11 word pointed to by it
     are taken)

14 : PDP-10 address subclass (whenever the object is encountered,
     the contents of the PDP-10 word pointed to by it
     are taken)

These  classes  and  subclasses  are  that  subset  of  all  the  pre-defined 
classes  and  subclasses  which  should  be  visible  to  the  user.  There 
are many others which are used by DAME and POOMAS functions. The
user may create objects with classes and subclasses other than
those pre-defined. In such objects the classes assigned should
be between octal 1000 and 77770 and subclasses between octal 70
and 77770 in order to avoid conflicts with the pre-defined ones.
Objects with such user-defined classes may not be typed out with
the TOBJ command.

***************************************************************

"Delete Object" Command
Syntax: DEL(<obj. name or address>)

Effect: Deletes the specified object and returns its space to
the free-space list.


*************************************************************** 

*************************************************************** 


Hook  Manipulation  Commands 


"Hook"  Command 

Syntax: HOOK(<hook specification>)

<hook specification> -> <general hook spec.> /
                        <addressed hook spec.>

<general hook spec.> -> '<gen. hook code> <action>
                        '<hook name>

<gen. hook code> -> GF / GS / IF / IC / NE / NX

<addressed hook spec.> -> '<addr. hook code>
                          <action>
                          <lower bound>
                          <upper bound>
                          '<hook name>

<addr. hook code> -> AIF / AIC / AF / AS

<hook name> -> <char. string up to 5 chars.>

<lower bound> -> <octal integer> / <regname>

<upper bound> -> <octal integer not smaller than
                 lower bound>

Effect: The HOOK command is the principal means by which the
user executes monitor routines during the execution of his
program. General hooks, i.e. those with codes 'GF, 'GS, 'IF, 'IC,
'NE or 'NX cause the execution of the specified monitor
routine at every: fetch, store, instruction fetch, instruction
completion, node entry or node exit respectively.

Addressed hooks, i.e. those with codes 'AF, 'AS, 'AIF or 'AIC,
cause the execution of the specified monitor routine whenever
a fetch, a store, an instruction fetch or the completion of an
instruction occurs from a location within the specified bounds.
If register names are used, the following additional rule must
be observed: for general registers the bounds must stay within
R0 to R7, and other registers, namely TKB, TKS, TPB, TPS and PS,
must be specified individually, by giving the same name for both
the lower and upper bounds.


"Disable Hook" Command
Syntax: DISAB(<hook-obj. name or address>)

Effect: Causes any future activations of the hook to be a
no-op.


***************************************************************


"Enable Hook" Command

Syntax: ENAB(<hook-obj. name or address>)

Effect: Causes the monitor routines associated with the hook
to be executed whenever the hook is activated.
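
The way general and addressed hooks respond to simulator events, and the effect of disabling a hook, can be sketched as follows. The event names, the `Hook` class and the `fire` function are hypothetical; only the hook codes come from the command descriptions above.

```python
# Hook codes mapped to the (hypothetical) event each one watches.
GENERAL = {"GF": "fetch", "GS": "store", "IF": "ifetch",
           "IC": "icomplete", "NE": "node_entry", "NX": "node_exit"}
ADDRESSED = {"AF": "fetch", "AS": "store",
             "AIF": "ifetch", "AIC": "icomplete"}


class Hook:
    def __init__(self, code, action, lower=None, upper=None):
        self.code = code
        self.action = action      # stands in for a monitor routine
        self.lower = lower
        self.upper = upper
        self.enabled = True       # DISAB clears this, ENAB sets it


def fire(hooks, event, address=None):
    """Run every enabled hook that applies to this event."""
    for hook in hooks:
        if not hook.enabled:
            continue              # a disabled hook is a no-op
        if GENERAL.get(hook.code) == event:
            hook.action(address)
        elif (ADDRESSED.get(hook.code) == event
              and address is not None
              and hook.lower <= address <= hook.upper):
            hook.action(address)
```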




***************************************************************

***************************************************************

Commands for Searching PDP-11 Execution History


"Find  Input-Set"  Command 

Syntax: FISET(<obj. id> <node-spec> <search-spec>
              [<direction> [<starting index>]])

<node-spec> -> '* / <node-id>

<search-spec> -> <routine name>
                 / <compound-instruction>

<direction> -> 'F / 'B

<starting index> -> <positive octal integer> /
                    <obj. name>


Effect: A search is made over the input-sets of past node
instances until one satisfying <search-spec> is found. If
such a node instance is found, the address of its input-set
is inserted in <obj. id> and the node-instance pointer NIP
(which is a PDP-10 global variable) is set to the index of
the node instance; otherwise a 36-bit -1 is inserted in
<obj. id> and NIP is unaffected. If a <node-id> is provided
in <node-spec>, only the instances of that node are searched;
if '* is specified all input sets are searched. If a <direction>
is provided, search takes place in that direction ('F for
"forward", 'B for "backward"); otherwise search takes place
backward. If a <starting-index> is provided, search starts
from that index, otherwise it starts from the most recent node
instance. (Note that if NIP is specified as <starting-index>,
it will start from the current setting of NIP.)

The procedure for the application of the predicate, i.e.
<search-spec>, is as follows: The system pushes the address of
the input-set to be tried on the stack. Hence, the routine or
compound instruction supplied in <search-spec>, referred to as
the predicate hereafter, must obtain that address by a POP(A)
instruction, where A is some input-set name. Then, the contents
of an address Q in the input-set can be extracted by the "find
value" instruction FVAL(B A Q) which will insert in the object
B the (16 bit) contents of -11 address Q in the set pointed to
by A if Q is in fact in that set, else -1. The predicate must
obtain the contents of all addresses in this manner and perform
normal arithmetic or comparison operations on them, which
constitutes the body of the predicate. Then finally, if the
desired conditions are met (i.e. the predicate is satisfied),
a PUSH(1), otherwise a PUSH(0), must be performed. Upon exit
from the predicate, the system will pop the stack. If the
popped value is 1, the index of the node instance just searched
will be inserted in <obj. id> and the instruction will be
terminated. Otherwise, if the end of the node trace history has been
reached, <obj. id> will be set to -1; else the address of the
next input-set to be searched will be pushed on the stack and
the cycle is repeated again.

Example: Suppose we wish to find the most recent input set
of an instance of node N where the contents of address 1000
is greater than the contents of address 2000. Provided the
objects B, X and Y have been previously created and the node
N previously defined by a NODE or NTR instruction, the following
instruction should do this:

FISET(B N (POP(A)
           FVAL(X A 1000)
           FVAL(Y A 2000)
           IF(X 'GT Y (PUSH(1)) (PUSH(0)))))

This instruction will insert in B either the address of the
input set of the most recent instance of N in which @1000 > @2000
or the value -1 if no such input-set can be found.
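
The search protocol just described (the system pushes an input-set, the predicate pops it and pushes 1 or 0, the system pops the verdict) can be sketched as below. This is an illustration only: input-sets are modelled as dictionaries from -11 address to contents, node instances as (node, input-set) pairs indexed from 0, and all function names are hypothetical. The sketch returns the index of the matching instance, or -1.

```python
def fval(io_set, address):
    """FVAL: contents of `address` if it appears in the set, else -1."""
    return io_set.get(address, -1)


def find_input_set(instances, predicate, node_id=None,
                   direction="B", start=None):
    """Search node instances, backward by default, for the first
    input-set accepted by `predicate`."""
    stack = []
    step = 1 if direction == "F" else -1
    index = len(instances) - 1 if start is None else start
    while 0 <= index < len(instances):
        node, input_set = instances[index]
        if node_id is None or node == node_id:
            stack.append(input_set)   # system pushes the candidate
            predicate(stack)          # predicate pops it, pushes 1 or 0
            if stack.pop() == 1:
                return index
        index += step
    return -1


def contents_1000_exceed_2000(stack):
    """The predicate of the example above."""
    a = stack.pop()
    x = fval(a, 0o1000)
    y = fval(a, 0o2000)
    stack.append(1 if x > y else 0)
```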

***************************************************************

"Find Output-Set" Command

Syntax: FOSET(<obj. id> <node-spec> <search-spec>
              [<direction> [<starting index>]])

Effect: The same as FISET except that output sets are searched
rather than input sets.


"Find Value" Command

Syntax: FVAL(<obj. id> <address of I/O set> <-11 address>)

Effect: If the specified -11 address appears in the specified
I/O set, then the contents of the address in that set, otherwise
-1, is inserted in <obj. id>.


"Find  Node  Instance"  Command 

Syntax: FNI(<obj. id> <node-spec> <instance-count>
            [<starting-index> [<direction>]])

<node-spec> -> <node-id> / CURNODE

<instance-count> -> <octal integer> / <obj. name>

Effect: An attempt is made to find the nth instance of the
node specified by <node-spec> where n = <octal integer> if one
is supplied, otherwise contents of word 0 of <obj. name>.
CURNODE means the current node. If a <starting-index> is
specified, the search starts from there, otherwise from the
current node instance. NIP is a valid parameter for
<starting-index>. If a <direction> is specified, the search proceeds in
that direction; otherwise it proceeds in the backward direction.

If the desired node instance is found, its index is inserted
into <obj. id> and into NIP. Otherwise, -1 is inserted into
<obj. id> and NIP is unaffected.


"Find Node Object" Command
Syntax: FNO(<obj. id> <-11 address>)

Effect: If <-11 address> is the starting address of a node,
the address of the node object, otherwise -1, is inserted in
<obj. id>.


***************************************************************


"Value-trace" Commands


"Initialize  Value-trace"  Command 


Syntax: IVT(<-11 addr.> <number> <obj. name>)

Effect: Creates a value-trace object with name <obj. name>
with enough room for <number> previous values and puts the
object in the MMO list of the specified -11 address or register.
Note: This command does not initiate the collection of values.
It merely creates an object to hold those values. The collection
of values is initiated by the VTH command.


***************************************************************


"Value-trace  Hook"  Command 
Syntax: VTH(<-11 addr. or reg. name>)

Effect: Causes the monitoring of values stored into the
specified core location or register and maintains a circular
buffer of the last k values, unique or non-unique, stored
there by the PDP-11, where k is the <number> specified in
the preceding IVT command for the same address. All accesses
by  the  -11  to  write  into  the  specified  cell  (including  auto 
register  incrementation,  decrementation,  turning  on/off  bits 
in  the  condition  code  or  the  device  registers)  are  considered 
store  operations  and  cause  a  new  entry  in  the  value-trace. 
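
The circular buffer kept by a value-trace can be sketched as follows. The class and method names are hypothetical. The sketch assumes that the "1st previous value" is the value most recently stored, and it follows the rules given for the # operation elsewhere in this appendix: a request beyond the declared capacity is an error, and a request for a value not yet stored yields the half word 777777 (octal).

```python
class ValueTrace:
    """Circular buffer of the last `capacity` values stored into
    one -11 location."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = [None] * capacity
        self.count = 0                      # total stores seen so far

    def store(self, value):
        """Record one store operation (16-bit value)."""
        self.buffer[self.count % self.capacity] = value & 0o177777
        self.count += 1

    def previous(self, b):
        """Return the b-th previous value (b = 1 is the latest)."""
        if b < 1 or b > self.capacity:
            raise ValueError("more values requested than declared")
        if b > self.count:
            return 0o777777                 # not yet stored
        return self.buffer[(self.count - b) % self.capacity]
```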




*************************************************************** 

*************************************************************** 

Disk I/O Commands

***************************************************************

*************************************************************** 

"Write  Disk"  Command 

Syntax: WDSK(<obj. id>)

Effect: Will write (in PDP-10 dump mode) on disk file USER.DAM
the contents of the object whose name or address is given in
<obj. id>. If the file does not exist, it will be created;
otherwise its old contents will be destroyed.

*************************************************************** 

"Write-Indirect  Disk"  Command 

Syntax: WIDSK(<address>)

Effect: Will perform WDSK(<obj. id>) where <address> contains
a pointer to <obj. id>. This command is particularly useful
for writing out objects pointed to by PDP-10 symbols.

*************************************************************** 

"Read  Disk"  Command 

Syntax: RDSK(<obj. id>)

Effect: Will read a 36-bit word from the binary file USER.DAM
(which had better exist!) into the object <obj. id>.


***************************************************************
*************************************************************** 

Miscellaneous  Commands 

*************************************************************** 

*************************************************************** 

"Load  PDP-11  Program"  Command 

Syntax: LOAD('<file name> <starting address>)

Effect:  The  file  must  be  in  the  absolute  unpacked  output 

format of the PALX-11 assembler or MACX11 with /I/A switches,
must have extension .BIN and the <file name> must be at most
5 characters. <starting address> must be an even octal integer
between 0 and 157776 - X, where X is the length of the program
in  bytes. 
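
The constraint on the starting address can be expressed as a one-line check. The function name is hypothetical; the bounds are those stated in the text, with the limits read as octal.

```python
def valid_load_address(start, length_bytes):
    """True if `start` is even and lies between 0 and
    157776 (octal) minus the program length in bytes."""
    return start % 2 == 0 and 0 <= start <= 0o157776 - length_bytes
```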

*************************************************************** 
Generalized  "Unary  Operation  with  Assignment"  Command 
Syntax: UA('<operation> <target> <opd>)

<operation> -> SUC / PRED / SAL / SIZE / ADDR / NOT
<target> -> <obj. id>
<opd> -> <obj. id>

Effect: The specified unary operation is performed on <opd>
and the result is inserted in <target>. 'SUC and 'PRED are
the successor and predecessor functions, respectively. 'SIZE
returns the number of user words in <opd>, 'ADDR the address
of <opd>, 'SAL the address of the secondary-attribute-list
(SAL) of <opd> and 'NOT the logical complement of the contents
of the first user word of <opd>.

*************************************************************** 

Generalized  "Binary  Operation  with  Assignment"  Command 

Syntax: BA('<operation> <target> <opd1> <opd2>)

<operation> -> + / - / * / <slash> / AND / OR / XOR
               / : / #

where <slash> is the integer division sign "/".

Effect: The command performs <target> <- <opd1> <operation>
<opd2>. A#B is the Bth previous value of -11 core location
A. A:B is the vector index operation and returns the contents
of word <opd2> of <opd1>. No bounds check is made. Any or
all of <target>, <opd1> and <opd2> may be octal constants, MMO
names or global variables. A constant for the <target> is
interpreted as a -10 address; for the others as a literal.

All the arithmetic and logical operations are defined the same
as in BLISS-10. In the case of # operation, a value-trace
hook for at least <opd2> previous values must have been placed
on <opd1> and a VTH command must have been issued (See IVT
and VTH Commands). If <opd2> exceeds the number of values
declared to be kept in the last IVT command for the location
<opd1>, an error message will be typed out and no assignment
will be made. If <opd2> number of values have not yet been
stored into the specified location, the half word #777777 will
be returned.

*************************************************************** 

"Execute  External"  Command 

Syntax: XX(<PDP-10 routine name> [<param. list>])

<param. list> -> <param> / <param. list> <param>

<param> -> <literal> / <identifier>

Effect: Calls the specified routine with the given parameters.

Caution: If identifiers are given as parameters, their addresses
are passed. If you wish the contents of the identifier passed,
include an additional parameter, namely the literal '. (single
quote followed by a dot and a space) before each such parameter.
This convention applies only to this and to the EVAL command
below.

"Evaluate"  Command 

Syntax: EVAL(<target> <PDP-10 routine name> [<param. list>])

Effect: The -10 routine is called in the same manner as in
Execute External. The only difference is that the value
returned by the routine is stored in <target>. The value
returned by a routine is assumed to be in register 3, following
BLISS/10 convention.


"Time"  Command 


Syntax: TIME(<obj. id> '<scale> '<type>)

<scale> -> MICS / MILS
(for microseconds or milliseconds respectively)
<type> -> FIX / FLOAT

Effect: Puts in word 0 of <obj. id> the current value of the
simulation clock according to the given specifications (i.e.
in microseconds or milliseconds and in fixed or floating point).

***************************************************************

"Plot"  Command 

Syntax: PLOT(<space count> <char>)

<space count> -> <literal> / <identifier>

Effect: Types carriage-return, line-feed, followed by
<space count> spaces and the character <char>.


A  LIST  OF  USEFUL  GLOBAL  PDP-10  S Y M B 0 T  S  AND  THEIP  CONTENTS 


SYMBOL       CONTENTS

             (For addressed fetch hooks)
AFHDATA      The data just fetched
AFHADDR      The address of the fetch

             (For addressed store hooks)
ASHDATA      The data to be stored
ASHADDR      The address of the store

DATA         Contents of Unibus Data lines
ADDR         Contents of Unibus Address lines
CONT         Contents of Unibus Control lines

OLDPC        Last value of the Program Counter (R7)

OPN          A unique integer between 0 and octal 111
             representing the current opcode
             (See next table)
OPC          The assembly language mnemonic for the
             current opcode

DSTREG       The destination register, mode and operand-
DSTMODE      value, respectively, of the most recent
DSTDATA      (including the current) single-operand or
             double-operand instruction

SRCREG       The source register, mode and operand-value
SRCMODE      of the most recent (including the current)
SRCDATA      double-operand instruction

HALTCOUNT    Number of instructions after which simulator
             will stop (normally maintained by DAME
             but may be set by user)

CURNODE      PDP-11 address of the current node
CURNOBJ      PDP-10 address of the node object for the
             current node

CISP         Pointer to current input-set
COSP         Pointer to current output-set




THE OCTAL VALUE OF OPN FOR EACH OPCODE (OPN = i + j)

i \ j  0      1      2      3      4      5                   6      7

0      MOV    MOVB   CMP    CMPB   BIT    BITB                BIC    BICB
10     BIS    BISB   ADD    SUB    CLR    CLRB                COM    COMB
20     INC    INCB   DEC    DECB   NEG    NEGB                ADC    ADCB
30     SBC    SBCB   TST    TSTB   ROR    RORB                ROL    ROLB
40     ASR    ASRB   ASL    ASLB   JMP    SWAB                No-op  CLC
50     CLV    CLZ    CLN    No-op  SEC    SEV                 SEZ    SEN
60     BR     BNE    BEQ    BGE    BLT    BGT                 BLE    BPL
70     BMI    BHI    BLOS   BVC    BVS    BCC                 BCS    Not used
100    JSR    RTS    HALT   WAIT   RTI    (break point trap)  IOT    RESET
110    EMT    TRAP

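Since every row of the OPN table holds eight entries, an OPN value splits into its row and column with a single division by 8. The Python sketch below illustrates that lookup; it is not part of DAME. The names `OPN_ROWS` and `mnemonic` are hypothetical, and the mnemonic spellings are the standard PDP-11 ones, which is an assumption wherever the printed table is hard to read.

```python
# Rows of the OPN table for i = 0, 10, 20, ..., 110 (octal).
# None stands for the entry the table marks "Not used".
OPN_ROWS = [
    ["MOV", "MOVB", "CMP", "CMPB", "BIT", "BITB", "BIC", "BICB"],
    ["BIS", "BISB", "ADD", "SUB", "CLR", "CLRB", "COM", "COMB"],
    ["INC", "INCB", "DEC", "DECB", "NEG", "NEGB", "ADC", "ADCB"],
    ["SBC", "SBCB", "TST", "TSTB", "ROR", "RORB", "ROL", "ROLB"],
    ["ASR", "ASRB", "ASL", "ASLB", "JMP", "SWAB", "No-op", "CLC"],
    ["CLV", "CLZ", "CLN", "No-op", "SEC", "SEV", "SEZ", "SEN"],
    ["BR", "BNE", "BEQ", "BGE", "BLT", "BGT", "BLE", "BPL"],
    ["BMI", "BHI", "BLOS", "BVC", "BVS", "BCC", "BCS", None],
    ["JSR", "RTS", "HALT", "WAIT", "RTI", "(break point trap)", "IOT", "RESET"],
    ["EMT", "TRAP"],
]


def mnemonic(opn: int):
    """Map an OPN value (0 through octal 111) to its table entry."""
    if not 0 <= opn <= 0o111:
        raise ValueError("OPN must lie between 0 and octal 111")
    row, col = divmod(opn, 8)  # row label i = 8 * row, column j = col
    return OPN_ROWS[row][col]


if __name__ == "__main__":
    for value in (0o0, 0o13, 0o44, 0o111):
        print(oct(value), mnemonic(value))
```

Writing the OPN values as octal literals (`0o13`, `0o111`) keeps the code directly comparable with the row and column labels of the table.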

APPENDIX  B 


Syntax  of  MPL 

module → MODULE name = e ELUDOM

block → BEGIN blockbody END / (blockbody)

compoundexpression → BEGIN expressionsequence END

blockbody → declarations; expressionsequence

declarations → declaration / declaration; declarations

expressionsequence → / e / e; expressionsequence

e → simpleexpression / controlexpression / name: e

simpleexpression → p10 ← e / p10

p10 → p9 / p10 OR p9

p9 → p8 / p9 AND p8

p8 → p7 / NOT p7

p7 → p6 / p6 relation p6

p6 → p5 / - p5 / p6 + p5 / p6 - p5

p5 → p4 / p5 * p4 / p5 / p4 / p5 MOD p4

p4 → p3 / p4 ↑ p3

p3 → decimal / name / name [elist] / e (elist)
     / e ( ) / block / compoundexpression

elist → e / elist, e

relation → EQL / NEQ / LSS / LEQ / GTR / GEQ

controlexpression → conditionalexpression / loopexpression /
                    choiceexpression / escapeexpression

conditionalexpression → IF e1 THEN e2 / IF e1 THEN e2 ELSE e3




loopexpression → WHILE e1 DO e2

loopexpression → INCR name FROM e1 TO e2 BY e3 DO e4

escapeexpression → EXIT level escapevalue /
                   RETURN escapevalue / LEAVE name escapevalue

level → / e

escapevalue → / e

choiceexpression → SELECT elist OF NSET nexpressionset TESN

nexpressionset → / ne / ne ; nexpressionset

ne → e : e

declaration → routinedeclaration / allocationdeclaration

allocationdeclaration → allocatetype idlist

allocatetype → GLOBAL / LOCAL / OWN / EXTERNAL / LABEL

idlist → id / idlist , id

id → name / name dimensionlist

dimensionlist → decimal / dimensionlist , decimal

routinedeclaration → ROUTINE name (namelist) = e /
                     ROUTINE name = e /
                     EXTERNAL flist

flist → name / flist, name

name → letter / name letter / name digit

letter → A / B / ... / Z

digit → 0 / 1 / ... / 9

decimal → digit / decimal digit
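
The precedence levels p10 down to p3 map naturally onto a recursive-descent parser, with one method per level. The Python sketch below covers only a subset of the grammar and is not the MPL compiler. It handles OR, AND, NOT, the six relations, unary minus, `+`, `-`, `*`, `/` and MOD over decimals, names and parenthesised expressions. Assignment, the p4 operator, calls, subscripts, declarations and control expressions are left out. The class name `Parser`, the function `parse` and the tuple-shaped parse tree are all hypothetical choices.

```python
import re

RELATIONS = {"EQL", "NEQ", "LSS", "LEQ", "GTR", "GEQ"}
KEYWORDS = RELATIONS | {"OR", "AND", "NOT", "MOD"}
TOKEN = re.compile(r"\s*(\d+|[A-Za-z][A-Za-z0-9]*|[-+*/()])")


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.toks, self.i = tokenize(text), 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def take(self):
        tok = self.peek()
        self.i += 1
        return tok

    def left_assoc(self, operand, operators):
        # Loop form of the left-recursive pattern  pN -> pM / pN op pM
        node = operand()
        while self.peek() in operators:
            node = (self.take(), node, operand())
        return node

    def p10(self):
        return self.left_assoc(self.p9, {"OR"})

    def p9(self):
        return self.left_assoc(self.p8, {"AND"})

    def p8(self):
        if self.peek() == "NOT":
            self.take()
            return ("NOT", self.p7())
        return self.p7()

    def p7(self):
        node = self.p6()
        if self.peek() in RELATIONS:  # at most one relation per p7
            node = (self.take(), node, self.p6())
        return node

    def p6(self):
        if self.peek() == "-":  # unary minus is allowed only on the first term
            self.take()
            node = ("NEG", self.p5())
        else:
            node = self.p5()
        while self.peek() in {"+", "-"}:
            node = (self.take(), node, self.p5())
        return node

    def p5(self):
        # The p4 level is skipped in this sketch, so p5 operands are p3 items.
        return self.left_assoc(self.p3, {"*", "/", "MOD"})

    def p3(self):
        tok = self.take()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = self.p10()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isdigit():
            return int(tok)
        if tok[0].isalpha() and tok not in KEYWORDS:
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(text)
    tree = parser.p10()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

For example, `parse("A + 2 * B LSS 10 AND NOT C")` groups the multiplication first, then the addition, then the relation, and finally the AND, mirroring the order of the levels in the grammar.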


